Abstract

It is well known that multiple-input multiple-output (MIMO) systems have high spectral efficiency, especially
when channel state information at the transmitter (CSIT) is available. When CSIT is obtained by feedback, it is
practical to assume that the channel state feedback rate is finite and that the CSIT is imperfect. For such a system, we
consider beamforming with a power on/off strategy for its simplicity and near optimality, where power on/off means
that a beamforming vector (beam) is either turned on with a constant power or turned off. The main contribution
of this paper is to accurately evaluate the information rate as a function of the channel state feedback rate. We call a
beam that is turned on an on-beam, and we call the minimum of the numbers of transmit and receive antennas the
dimension of a MIMO system. We prove that the ratio of the optimal number of on-beams to the system dimension converges
to a constant for a given signal-to-noise ratio (SNR) when the numbers of transmit and receive antennas approach
infinity simultaneously and beamforming is perfect. Asymptotic formulas are derived to evaluate this ratio
and the corresponding information rate per dimension. The asymptotic results apply accurately to finite
dimensional systems and suggest a power on/off strategy with a constant number of on-beams. For this suboptimal
strategy, we introduce a power efficiency factor, which is a function of the feedback rate, to
quantify the effect of imperfect beamforming. By combining the power efficiency factor with the asymptotic formulas
for the perfect beamforming case, the information rate of the power on/off strategy with a constant number of on-beams
is accurately characterized.
Index Terms

MIMO, finite rate feedback, power on/off, beamforming 

I. Introduction 

This paper considers multiple-input multiple-output (MIMO) systems with finite rate channel state
feedback. Multiple-antenna wireless communication systems, also known as MIMO systems, have high
spectral efficiency. It is also well known that the capacity of MIMO systems with channel state information
(CSI) at the transmitter (CSIT) is generally higher than that of systems without it. When perfect CSI is
available at both the transmitter and the receiver (CSITR), the MIMO channel can be viewed as a set of parallel
sub-channels, and the transmission power on each sub-channel follows the water-filling principle [1]. If CSIT is
obtained from channel state feedback, however, perfect CSIT requires an infinite feedback rate, which is
not practical. On the other hand, practical systems such as UMTS-HSDPA [2] have a control field
that can carry a certain number of channel state feedback bits per fading block.
It is therefore reasonable to consider MIMO systems with finite rate channel state feedback.

For a given feedback rate, this paper tries to answer two basic questions: how much benefit the feedback
can bring, and how to exploit the feedback to achieve that benefit. These two questions are difficult to answer
in general. To achieve or calculate the information rate for a given feedback rate, the optimal transmission
strategy and the optimal feedback strategy need to be found. It has been shown in [3], [4] that the design of
transmission and feedback strategies is an unconventional optimization problem. For memoryless channels,
it is proved that the information theoretic limit can be achieved by memoryless transmission and feedback
strategies. However, the explicit forms of the optimal strategies are still unknown. In [3], [4], the Lloyd algorithm is
used to obtain suboptimal numerical solutions.

On the other hand, the optimization problems can be simplified if the transmission strategy is restricted
to the power on/off strategy (with beamforming). In a general setting, the optimal transmission strategy is to
choose the covariance matrix of the transmitted Gaussian coded symbols according to the current feedback
[3], [4]. By the singular value decomposition, the covariance matrix can be decomposed into a unitary matrix
and a non-negative diagonal matrix, which are called the beamforming matrix and the power control matrix
respectively. We call each column vector of the beamforming matrix a beam and the diagonal
element corresponding to a beam the power on that beam. The power on/off strategy means that a
beam is either turned on, i.e., its power is a positive constant $P_{\mathrm{on}}$, or turned off, i.e., its power is zero.
As we will show later, the power on/off and beamforming assumption simplifies the analysis. Although
power on/off is suboptimal, it has been shown in [5] and [6] that power on/off can achieve performance
close to water-filling power control for single antenna systems and parallel Gaussian channels respectively.
This paper will show that power on/off is near optimal for MIMO channels as well.

The main contribution of this paper is to accurately characterize the information rate of the power on/off
strategy with finite rate channel state feedback. We call a beam that is turned on an on-beam. The optimization
problem corresponding to the power on/off strategy is to find, according to the channel realization, the optimal
number of on-beams, which is related to power control, and the directions of the on-beams, which is called
beamforming. Both power control and beamforming influence the overall information
rate. By analyzing these two effects separately, this paper is able to characterize the overall information
rate accurately.

To isolate the effect of beamforming, we first discuss the perfect beamforming case. Perfect beamforming 
means that the beamforming matrix at the transmitter changes the MIMO channel to parallel channels 
without interference. We analyze this case by asymptotics, where the numbers of the transmit and receive 
antennas approach infinity simultaneously. The derived asymptotic results are as follows. 

• Define the minimum number of transmit and receive antennas as the dimension of a MIMO system. 
We prove that the ratio of the optimal number of on-beams and the system dimension converges to 
a constant for a given signal-to-noise ratio (SNR) and perfect beamforming. This result suggests a
power on/off strategy with a constant number of on-beams. The assumption of a constant number of
on-beams is crucial for analyzing the effect of imperfect beamforming.

• We also prove that the optimal number of on-beams is a non-decreasing function of SNR. 

• We derive asymptotic formulas to simplify the calculations. By following the method developed 
in [7], [8], we derive asymptotic formulas to evaluate the optimal number of on-beams and the 
corresponding information rate, which are obtained by simulation traditionally. Furthermore, for the 
CSITR case, asymptotic formulas are derived to calculate the Lagrange multiplier required for water 
filling power control and the corresponding channel capacity for the first time. 

It is noteworthy that the asymptotic results are accurate enough for MIMO systems with finitely many
antennas.

Then we quantify the effect of imperfect beamforming accurately by assuming a constant number of
on-beams. Many works study similar problems. Some works add structure to make
the MIMO system equivalent to a single-input single-output (SISO) system. The structure could be
a single receive antenna [9]-[13] or a single beam for a single data stream [14]-[17]. For MIMO systems
with multiple beams, transmit antenna subset selection is viewed as a special case. Different antenna
selection criteria are proposed in [18], [19], and the effect on the information rate is analyzed for extreme
SNR regimes in [20], [21]; that analysis is hard to generalize to other SNR regimes. For general
multiple beams, assuming that the transmitter knows some singular vectors of the channel matrix perfectly,
power allocation to maximize the information rate is discussed in [22] and beamforming matrix selection
to minimize the bit error rate (BER) is proposed in [23]. More practically, if the information about the
channel state is obtained through a finite rate feedback, it is reasonable to assume that the transmitter
only knows quantized information about the channel state. The popular strategy is to construct a finite
size beamforming codebook and select a beamforming matrix for transmission according to the channel
state feedback. Algorithms to construct a beamforming codebook are proposed in [24]-[26]. The
beamforming codebook design criteria and the beamforming matrix feedback criteria, which are often
coupled, are discussed in [23], [27]-[30]. Based on Grassmann manifolds, the effect of finite beamforming
on performance is analyzed in [27], [29] and refined later in [31], all of which rely on Barg's formula
[32], which is only valid for MIMO systems with an asymptotically large number of transmit antennas but
a fixed finite number of receive antennas. For general MIMO systems, the performance of finite beamforming is
analyzed for the high SNR region in [30], which is difficult to generalize to other SNR regimes. For
all SNR regimes, the information rate is quantified in [33], [34] by letting the numbers of transmit and
receive antennas approach infinity simultaneously and applying extreme order statistics. The proposed
formula overestimates the performance; a correction of the result is given in [35]. In the present paper,
we take a novel approach by introducing the power efficiency factor to quantify the effect of imperfect
beamforming. The power efficiency factor can be calculated using a closed form formula derived in [36],
which is valid for MIMO systems with an arbitrary number of antennas. As a result, the information rate is
accurately analyzed as a function of the feedback rate. The analysis matches the simulations almost perfectly
for all SNR regimes.

Finally, we show the near optimality of the power on/off strategy with a constant number of on-beams 
by comparing it with a general power on/off strategy. For a general power on/off strategy, we derive the 
optimal feedback strategy for a given arbitrary beamforming codebook. Then we are able to compare the 
two different power on/off strategies numerically. Simulations show that a constant number of on-beams 
is near optimal for all SNR regimes. Therefore, power on/off strategy with a constant number of on-beams 
provides a simple but near optimal solution. 

This paper is organized as follows. The system model and the related design problem are outlined in
Section II, where preliminary knowledge about random matrices and Stiefel and Grassmann manifolds is
also presented. In Section III, the power on/off strategy with a constant number of on-beams is derived as
the optimal solution for perfect beamforming. Section IV considers the effect of imperfect beamforming
due to finite rate channel state feedback. Section V shows that a constant number of on-beams is also
near optimal for imperfect beamforming. Conclusions are given in Section VI.

II. Preliminaries 

In this section, we first describe the system model. Then we present some preliminary knowledge about 
random matrices and Stiefel and Grassmann manifolds. 

In this paper, we use $\mathbb{Z}^{+}$ to denote the set of positive integers, $\mathbb{R}^{k}$ and $\mathbb{C}^{k}$ to denote the $k$-dimensional
real and complex vector spaces respectively, $\mathbb{C}^{k \times l}$ to denote the vector space of $k \times l$ complex matrices,
$\mathbf{I}_{k}$ to denote the $k \times k$ identity matrix, $\mathbf{A}^{\dagger}$ to denote the conjugate transpose of a matrix $\mathbf{A}$, $\mathrm{tr}(\cdot)$ to
denote the trace of a matrix, $\mathrm{rank}(\cdot)$ to denote the rank of a matrix, $\|\cdot\|_{F}$ to denote the matrix Frobenius
norm, $|\cdot|$ to denote the determinant of a matrix or the cardinality of a set according to its context, $\mathrm{E}_{X}[\cdot]$
to denote the expectation with respect to the random variable $X$, and $\arg\max$ and $\arg\min$ to denote the
functions that return the global maximizer and minimizer respectively.

A. System Model and the Corresponding Design Problem 

A communication system with $L_T$ transmit antennas and $L_R$ receive antennas is shown in Fig. 1. Let
$\mathbf{T} \in \mathbb{C}^{L_T \times 1}$ be the transmitted signal, $\mathbf{Y} \in \mathbb{C}^{L_R \times 1}$ be the received signal, $\mathbf{H} \in \mathbb{C}^{L_R \times L_T}$ be the channel
state matrix and $\mathbf{Z} \in \mathbb{C}^{L_R \times 1}$ be the Gaussian noise with zero mean. The system model can be expressed
as

\[ \mathbf{Y} = \mathbf{H}\mathbf{T} + \mathbf{Z}, \]

where $\mathrm{E}[\mathbf{Z}\mathbf{Z}^{\dagger}] = \mathbf{I}_{L_R}$. In this paper, the Rayleigh flat fading channel is considered: the entries of $\mathbf{H}$
are independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian variables with
zero mean and unit variance ($\mathcal{CN}(0, 1)$), and $\mathbf{H}$ is i.i.d. for each channel use.$^{1}$ At the beginning of each
channel use, the channel state $\mathbf{H}$ is assumed to be perfectly estimated at the receiver, then quantized to
a finite number of bits and fed back to the transmitter through a feedback channel. The feedback channel is assumed
to be error-free and zero-delay. The rate of the feedback is up to $R_{\mathrm{fb}}$ bits/channel use. After receiving the
channel state feedback, the transmitter transmits the encoded signal according to the current feedback.$^{2}$

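The general design problem for finite rate channel state feedback is difficult to solve. It is well known
that the optimal transmitted signal should be a circularly symmetric Gaussian signal with zero mean and
covariance matrix adapted according to the feedback [3]. Define the covariance matrix of the transmitted
signal as $\boldsymbol{\Sigma} = \mathrm{E}[\mathbf{T}\mathbf{T}^{\dagger}]$, the codebook of the covariance matrices as

\[ \mathcal{B}_{\Sigma} = \{\boldsymbol{\Sigma}_i \in \mathbb{C}^{L_T \times L_T} : 1 \le i \le 2^{R_{\mathrm{fb}}}\} \]

and the feedback function $\varphi(\cdot)$ as a mapping from the space of $\mathbf{H}$ to the index set $\{i : 1 \le i \le 2^{R_{\mathrm{fb}}}\}$.
The corresponding optimization problem is to find the optimal codebook $\mathcal{B}_{\Sigma}$ and the optimal feedback
function $\varphi(\cdot)$ to maximize the information rate

\[ \max_{\mathcal{B}_{\Sigma}} \max_{\varphi(\cdot)} \mathrm{E}_{\mathbf{H}}\left[\ln\left|\mathbf{I}_{L_R} + \mathbf{H}\boldsymbol{\Sigma}_{\varphi(\mathbf{H})}\mathbf{H}^{\dagger}\right|\right], \]

with the average power constraint$^{3}$ $\rho$

\[ \mathrm{E}_{\mathbf{H}}\left[\mathrm{tr}\left(\boldsymbol{\Sigma}_{\varphi(\mathbf{H})}\right)\right] \le \rho. \]

It has been shown in [3] that the design of the covariance codebook and the design of the feedback function are
two coupled optimization problems that are difficult to solve.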

To obtain an analytic solution that reflects the influence of the feedback rate on the information rate, we
simplify the general problem to the suboptimal power on/off strategy (with beamforming). Later in
this paper, we will show that the power on/off strategy is near optimal. Denote the singular value decomposition
of the covariance matrix as $\boldsymbol{\Sigma} = \mathbf{Q}\mathbf{P}\mathbf{Q}^{\dagger}$, where the matrices $\mathbf{Q}$ and $\mathbf{P}$ are called the beamforming matrix
and the power control matrix respectively. We call the column vectors of $\mathbf{Q}$ beams, and a beam
with positive power an on-beam. The statistics of the transmitted signal are uniquely
determined by the on-beams and the power on them. In our power on/off model, every on-beam carries
a constant power $P_{\mathrm{on}}$. The transmitted Gaussian signal $\mathbf{T}$ can be expressed as

\[ \mathbf{T} = \mathbf{Q}\mathbf{X}, \]

where $\mathbf{X}$ is a random Gaussian vector with zero mean and covariance matrix $P_{\mathrm{on}}\mathbf{I}_s$, $s$ is the number of
on-beams, and the beamforming matrix $\mathbf{Q} \in \mathbb{C}^{L_T \times s}$ is composed of the $s$ on-beams and satisfies $\mathbf{Q}^{\dagger}\mathbf{Q} = \mathbf{I}_s$.
The system model for the power on/off strategy is given by

\[ \mathbf{Y} = \mathbf{H}\mathbf{Q}\mathbf{X} + \mathbf{Z}. \]

The optimization problem for the power on/off strategy is stated in Problem 1. Since the number of
on-beams $s$ is the rank of the beamforming matrix $\mathbf{Q}$, the feedback only needs to specify $\mathbf{Q}$. Denote the
codebook of beamforming matrices as

\[ \mathcal{B} = \{\mathbf{Q}_i \in \mathbb{C}^{L_T \times s} : \mathbf{Q}_i^{\dagger}\mathbf{Q}_i = \mathbf{I}_s,\ 0 \le s \le L_T,\ 1 \le i \le 2^{R_{\mathrm{fb}}}\}. \]

The feedback function $\varphi(\cdot)$ is a mapping from the space of $\mathbf{H}$ to the index set $\{i : 1 \le i \le 2^{R_{\mathrm{fb}}}\}$.

Problem 1: (Power On/Off Strategy Design Problem) Find the optimal beamforming codebook $\mathcal{B}$,
feedback function $\varphi(\cdot)$ and $P_{\mathrm{on}}$ to maximize the information rate,

\[ \max_{P_{\mathrm{on}}} \max_{\mathcal{B}} \max_{\varphi(\cdot)} \mathrm{E}_{\mathbf{H}}\left[\ln\left|\mathbf{I}_{L_R} + P_{\mathrm{on}}\mathbf{H}\mathbf{Q}_{\varphi(\mathbf{H})}\mathbf{Q}_{\varphi(\mathbf{H})}^{\dagger}\mathbf{H}^{\dagger}\right|\right] \]

with the average power constraint

\[ \mathrm{E}_{\mathbf{H}}\left[P_{\mathrm{on}}\,\mathrm{tr}\left(\mathbf{Q}_{\varphi(\mathbf{H})}^{\dagger}\mathbf{Q}_{\varphi(\mathbf{H})}\right)\right] = P_{\mathrm{on}}\mathrm{E}_{\mathbf{H}}[s] \le \rho, \]

where $s = s(\mathbf{H}) = \mathrm{rank}(\mathbf{Q}_{\varphi(\mathbf{H})})$ is the number of on-beams for a channel realization $\mathbf{H}$.

As we will show later, the power on/off assumption is the key to decoupling the beamforming codebook
design and the feedback function design.

$^{1}$ This is a suitable model for the block fading channel when the channel state can be estimated and fed back at the beginning of each
fading block.

$^{2}$ For i.i.d. channel states, the memoryless transmission and feedback strategy can achieve the information theoretic limit provided by the
finite rate channel state feedback [3].

$^{3}$ The average power constraint $\rho$ is also the average received SNR because the variance of the Gaussian noise is normalized to 1.



B. Random Matrix Theory 

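In this subsection, we review relevant results on the spectra of large random matrices. Recall that $\mathbf{H}$ is
an $L_R \times L_T$ random matrix with i.i.d. complex Gaussian entries with zero mean and unit variance. Define
$m = \min\{L_T, L_R\}$ and $n = \max\{L_T, L_R\}$. Define

\[ \mathbf{W} \triangleq \begin{cases} \frac{1}{m}\mathbf{H}\mathbf{H}^{\dagger} & \text{if } L_R < L_T \\ \frac{1}{m}\mathbf{H}^{\dagger}\mathbf{H} & \text{if } L_R \ge L_T. \end{cases} \]

Let $\{\lambda_1, \ldots, \lambda_m\}$ be the set of the eigenvalues of $\mathbf{W}$. Define the empirical eigenvalue distribution of $\mathbf{W}$ as

\[ F(\lambda) \triangleq \frac{1}{m}\left|\{j : \lambda_j \le \lambda\}\right|. \]

Then as $m$ and $n$ approach infinity simultaneously with $\tau = n/m$ fixed,

\[ \lim_{(n,m)\to\infty} \frac{dF(\lambda)}{d\lambda} = \begin{cases} \frac{1}{2\pi\lambda}\sqrt{(\lambda^{+} - \lambda)(\lambda - \lambda^{-})} & \text{for } \lambda \in [\lambda^{-}, \lambda^{+}] \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

almost surely, where $\lambda^{\pm} = (\sqrt{\tau} \pm 1)^2$ [37]. Furthermore, consider a spectral statistical function of the
form

\[ g(\mathbf{W}) = \frac{1}{m}\sum_{i=1}^{m} g(\lambda_i). \]

If $g$ is continuous and bounded on $[\lambda^{-}, \lambda^{+}]$, then

\[ \lim_{(n,m)\to\infty} g(\mathbf{W}) = \int g(\lambda)\, dF(\lambda) \tag{2} \]

almost surely [7], [8], [37].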
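As a quick numerical sanity check of the limiting density in (1), the following Python sketch integrates it with a midpoint rule. It is only an illustration under stated assumptions: the names `mp_density` and `mp_integral`, the value `tau = 2.0` (standing for the ratio n/m) and the step count are choices made here, not taken from the paper. The total mass should be close to 1, and the mean should be close to n/m, because the average eigenvalue of W is tr(W)/m, whose expectation is n/m.

```python
import math

def mp_density(lam, tau):
    """Limiting eigenvalue density of W for a ratio tau = n/m >= 1 (assumed form of (1))."""
    lam_minus = (math.sqrt(tau) - 1.0) ** 2
    lam_plus = (math.sqrt(tau) + 1.0) ** 2
    if lam <= lam_minus or lam >= lam_plus:
        return 0.0
    return math.sqrt((lam_plus - lam) * (lam - lam_minus)) / (2.0 * math.pi * lam)

def mp_integral(g, tau, steps=100_000):
    """Midpoint-rule approximation of the integral of g(lam) * density over the support."""
    lam_minus = (math.sqrt(tau) - 1.0) ** 2
    lam_plus = (math.sqrt(tau) + 1.0) ** 2
    h = (lam_plus - lam_minus) / steps
    total = 0.0
    for i in range(steps):
        lam = lam_minus + (i + 0.5) * h
        total += g(lam) * mp_density(lam, tau)
    return total * h

tau = 2.0  # illustrative ratio n/m, not a value from the paper
mass = mp_integral(lambda lam: 1.0, tau)   # total probability mass
mean = mp_integral(lambda lam: lam, tau)   # mean eigenvalue
print(mass, mean)
```

The same helper can be reused with any continuous bounded g, which is the situation covered by (2).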



C. Stiefel and Grassmann Manifolds 

Stiefel manifolds and Grassmann manifolds are the geometric objects relevant to beamforming
codebook design. The Stiefel manifold $\mathcal{S}_{L_T,s}(\mathbb{C})$ (where $L_T \ge s$) is the set of all complex unitary $L_T \times s$
matrices, $\mathcal{S}_{L_T,s}(\mathbb{C}) = \{\mathbf{Q} \in \mathbb{C}^{L_T \times s} : \mathbf{Q}^{\dagger}\mathbf{Q} = \mathbf{I}_s\}$. Define an equivalence relation on the Stiefel manifold:
two matrices $\mathbf{P}, \mathbf{Q} \in \mathcal{S}_{L_T,s}(\mathbb{C})$ are equivalent if their column vectors span the same subspace. The
Grassmann manifold $\mathcal{G}_{L_T,s}(\mathbb{C})$ is defined as the quotient space of $\mathcal{S}_{L_T,s}(\mathbb{C})$ with respect to this equivalence
relation. It can also be viewed as the set of all the $s$-dimensional planes through the origin in the
$L_T$-dimensional Euclidean space [38], [39]. A generator matrix $\mathbf{Q} \in \mathcal{S}_{L_T,s}(\mathbb{C})$ for an $s$-plane $Q \in \mathcal{G}_{L_T,s}(\mathbb{C})$
is defined as a matrix whose columns span $Q$. The generator matrix is not unique: if $\mathbf{Q}$ is a generator
matrix for an $s$-dimensional plane $Q \in \mathcal{G}_{L_T,s}(\mathbb{C})$, then $\mathbf{Q}\mathbf{U}$ with $\mathbf{U} \in \mathcal{S}_{s,s}(\mathbb{C})$ is also a generator matrix of
the same plane $Q$ [38].

This paper considers the projection Frobenius metric (chordal distance) on the Grassmann manifold
because it is relevant to the performance analysis of the power on/off strategy. The chordal distance
between two $s$-planes $Q_1, Q_2 \in \mathcal{G}_{L_T,s}(\mathbb{C})$ can be defined by their generator matrices,

\[ d_c(Q_1, Q_2) \triangleq \frac{1}{\sqrt{2}}\left\|\mathbf{Q}_1\mathbf{Q}_1^{\dagger} - \mathbf{Q}_2\mathbf{Q}_2^{\dagger}\right\|_F = \sqrt{s - \mathrm{tr}\left((\mathbf{Q}_1^{\dagger}\mathbf{Q}_2)(\mathbf{Q}_1^{\dagger}\mathbf{Q}_2)^{\dagger}\right)}, \tag{3} \]

where $\mathbf{Q}_1$ and $\mathbf{Q}_2$ are the generator matrices of $Q_1$ and $Q_2$ respectively [38]. Since the chordal distance
is independent of the choice of the generator matrices, it is well defined [38].
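To make the chordal distance concrete, here is a small Python sketch that evaluates it in two ways: through the projection matrices and through the trace of the cross-product matrix. It is a hedged illustration only. It assumes the usual normalization, (1/sqrt(2)) times the Frobenius norm of the difference of the projections, and it uses real-valued generator matrices so that the conjugate transpose reduces to the ordinary transpose. The helper names and the example planes are hypothetical. For two planes that share one direction and differ by an angle theta in the other, both forms should give sin(theta), and rotating a generator matrix by a unitary (here orthogonal) matrix should not change the result.

```python
import math

def matmul(a, b):
    """Product of two real matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def chordal_projection_form(q1, q2):
    """(1/sqrt(2)) * || Q1 Q1^T - Q2 Q2^T ||_F for real generator matrices."""
    p1 = matmul(q1, transpose(q1))
    p2 = matmul(q2, transpose(q2))
    sq = sum((x - y) ** 2 for r1, r2 in zip(p1, p2) for x, y in zip(r1, r2))
    return math.sqrt(sq / 2.0)

def chordal_trace_form(q1, q2):
    """sqrt(s - tr((Q1^T Q2)(Q1^T Q2)^T)) for real generator matrices."""
    s = len(q1[0])
    m = matmul(transpose(q1), q2)
    trace = sum(x * x for row in m for x in row)  # trace of M M^T
    return math.sqrt(max(s - trace, 0.0))

theta = math.pi / 6  # illustrative principal angle
q1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
q2 = [[1.0, 0.0], [0.0, math.cos(theta)], [0.0, math.sin(theta)]]

phi = 0.7  # arbitrary rotation of the generator matrix within the same plane
rot = [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]
q2_rotated = matmul(q2, rot)

d_proj = chordal_projection_form(q1, q2)
d_trace = chordal_trace_form(q1, q2)
d_rot = chordal_projection_form(q1, q2_rotated)
print(d_proj, d_trace, d_rot)
```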

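The invariant measure and the uniform distribution play a crucial role in the statistics on $\mathcal{S}_{L_T,s}(\mathbb{C})$ and
$\mathcal{G}_{L_T,s}(\mathbb{C})$. Let $\mathcal{M}$ be a measurable set in $\mathcal{S}_{L_T,s}(\mathbb{C})$ or $\mathcal{G}_{L_T,s}(\mathbb{C})$. A measure $\mu$ is called invariant if

\[ \mu(\mathbf{A}\mathcal{M}) = \mu(\mathcal{M}) = \mu(\mathcal{M}\mathbf{B}) \]

for an arbitrary $L_T \times L_T$ unitary matrix $\mathbf{A}$ and $s \times s$ unitary matrix $\mathbf{B}$. The invariant probability measure
defines the uniform distribution on $\mathcal{S}_{L_T,s}(\mathbb{C})$ or $\mathcal{G}_{L_T,s}(\mathbb{C})$ [32], [40].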

III. Power On/Off Strategy with Perfect Beamforming 

To isolate the effect of power on/off from the effect of imperfect beamforming, this section discusses
the perfect beamforming case. The effect of imperfect beamforming will be treated in Section IV.

In this section and throughout, the following notation is used. Define $m = \min(L_T, L_R)$ and
$n = \max(L_T, L_R)$. Define the normalized number of on-beams as $\bar{s} = \frac{1}{m}s$ and the normalized on-power as
$\bar{P}_{\mathrm{on}} = mP_{\mathrm{on}}$. Define $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^{\dagger}$ if $L_R < L_T$ or $\mathbf{W} = \frac{1}{m}\mathbf{H}^{\dagger}\mathbf{H}$ if $L_R \ge L_T$. Denote the $i$th largest
eigenvalue of $\mathbf{W}$ by $\lambda_i$.

To analyze the perfect beamforming case, Section III-A describes the corresponding optimization
problem, Section III-B solves the optimization problem by letting $L_T$ and $L_R$ approach infinity simultaneously,
and Section III-C shows that the asymptotic solution is near optimal for MIMO systems with finitely many
antennas.




A. The Design Problem with Perfect Beamforming 

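The definition of perfect beamforming is given as follows. Consider the singular value decomposition of
the channel state matrix $\mathbf{H} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{V}^{\dagger}$. Perfect beamforming means that for all $\mathbf{H} \in \mathbb{C}^{L_R \times L_T}$ and $1 \le s \le L_T$,
there exists $\mathbf{Q} \in \mathcal{B}$ such that the $s$ columns of the beamforming matrix $\mathbf{Q} \in \mathbb{C}^{L_T \times s}$ are columns of
the right singular-vector matrix $\mathbf{V}$, i.e., the elements of $\mathbf{V}^{\dagger}\mathbf{Q} \in \mathbb{C}^{L_T \times s}$ are either 1 or 0.

With perfect beamforming, the optimization problem can be simplified. Suppose that $P_{\mathrm{on}}$ and $s = s(\mathbf{H})$
are given. For a channel realization $\mathbf{H}$, the optimal feedback beamforming matrix is $\mathbf{Q}_{\varphi(\mathbf{H})} = \mathbf{V}_s$,$^{4}$ where
$\mathbf{V}_s$ is composed of the right singular vectors corresponding to the largest $s$ singular values of $\mathbf{H}$. Then
the mutual information between the transmitted signal and the received signal is

\[ \mathcal{I}(\mathbf{H}) = \ln\left|\mathbf{I}_{L_R} + P_{\mathrm{on}}\mathbf{H}\mathbf{Q}_{\varphi(\mathbf{H})}\mathbf{Q}_{\varphi(\mathbf{H})}^{\dagger}\mathbf{H}^{\dagger}\right| = \sum_{i=1}^{s}\ln\left(1 + \bar{P}_{\mathrm{on}}\lambda_i\right). \tag{4} \]

$^{4}$ Rigorously speaking, the beamforming matrix $\mathbf{Q} = \mathbf{V}_s\mathbf{U}$ is optimal for any $s \times s$ unitary matrix $\mathbf{U}$.

The corresponding optimization problem is stated as follows.

Problem 2: (Power On/Off Design with Perfect Beamforming) Find the optimal function $s = s(\mathbf{H})$ (or
$\bar{s} = \bar{s}(\mathbf{H})$) and $P_{\mathrm{on}}$ (or $\bar{P}_{\mathrm{on}}$) to maximize the information rate,

\[ \max_{\bar{P}_{\mathrm{on}}} \max_{\bar{s}(\cdot)} \mathrm{E}_{\mathbf{H}}\left[\sum_{i=1}^{s}\ln\left(1 + \bar{P}_{\mathrm{on}}\lambda_i\right)\right] \]

with the power constraint

\[ \mathrm{E}_{\mathbf{H}}[sP_{\mathrm{on}}] = \bar{P}_{\mathrm{on}}\mathrm{E}_{\mathbf{H}}[\bar{s}] \le \rho. \]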
The following theorem gives the form of the optimal $\bar{s}$ function that solves Problem 2.

Theorem 1: The optimal $\bar{s}$ function to solve Problem 2 is of the form

\[ \bar{s} = \frac{1}{m}\left|\{k : \lambda_k \ge \kappa\}\right|, \tag{5} \]

where $\kappa$ is the appropriate threshold chosen to satisfy the average power constraint

\[ \bar{P}_{\mathrm{on}}\mathrm{E}_{\mathbf{H}}[\bar{s}] = \rho. \]

Proof: See Appendix A. ■

The intuition behind the proof is that all the "good" beams (corresponding to $\lambda \ge \kappa$) and only the
"good" beams should be turned on. This intuition will be used again in the proof of a later theorem.

Although the form of the optimal $\bar{s}$ function is given in (5), it is difficult to find the key parameters (the
optimal $\bar{P}_{\mathrm{on}}$ and $\kappa$) and the corresponding information rate $\mathcal{I}$. Unlike the water-filling solution
for the CSITR case, where the Lagrange multiplier is uniquely determined by $\rho$ [1], the power on/off strategy has
uncountably many pairs of $\bar{P}_{\mathrm{on}}$ and $\kappa$ corresponding to the same $\rho$. Numerical search may be employed to
find the optimal $\bar{P}_{\mathrm{on}}$, $\kappa$ and the corresponding $\mathcal{I}$. However, if the numbers of transmit and receive antennas
approach infinity simultaneously, as we will show in Section III-B, the corresponding key parameters and
information rate can be explicitly computed.
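The threshold rule for a single channel realization can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `on_beam_rate`, the eigenvalue list and the values of the threshold and of the normalized on-power are all hypothetical. The sketch turns on every beam whose eigenvalue is at least the threshold and sums ln(1 + normalized on-power times eigenvalue) over the on-beams, which is the per-realization mutual information under perfect beamforming.

```python
import math

def on_beam_rate(eigenvalues, kappa, p_on_bar):
    """Return (s, rate) for one realization.

    s is the number of eigenvalues that are >= kappa (the on-beams), and
    rate is the sum of ln(1 + p_on_bar * lambda_i) over those on-beams.
    """
    on = [lam for lam in eigenvalues if lam >= kappa]
    rate = sum(math.log(1.0 + p_on_bar * lam) for lam in on)
    return len(on), rate

eigs = [2.9, 1.4, 0.6, 0.05]  # hypothetical eigenvalues of W
s, rate = on_beam_rate(eigs, kappa=0.5, p_on_bar=1.0)
print(s, rate)
```

In a numerical search, one would evaluate this for many realizations, average the results, and adjust the threshold and the on-power jointly so that the average power constraint holds.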



B. MIMO Systems with Infinitely Many Antennas

This section provides explicit formulas to solve Problem 2 by letting the numbers of transmit and
receive antennas approach infinity simultaneously. As a byproduct of the employed method, this section
also presents asymptotic formulas for the capacity of the CSITR case. To the authors' knowledge,
the derived asymptotic formulas are presented for the first time.

1) Asymptotic Analysis for Power On/Off Strategy: The main result of the asymptotic analysis is the
following theorem, which gives the optimal $\bar{s}$ function when the numbers of transmit and receive antennas
approach infinity simultaneously.

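Theorem 2: Define $\tau = n/m$. For a given SNR $\rho$, if $m$ and $n$ approach infinity simultaneously with $\tau$
fixed, the optimal $\bar{s}$ function converges to a constant,

\[ \bar{s}_{\infty} = \lim_{(n,m)\to\infty} \bar{s} = \int_{\kappa}^{\lambda^{+}} f(\lambda)\, d\lambda, \]

almost surely, and the corresponding normalized information rate $\bar{\mathcal{I}} = \frac{1}{m}\mathcal{I}$ also converges to a constant,

\[ \bar{\mathcal{I}}_{\infty} = \lim_{(n,m)\to\infty} \bar{\mathcal{I}} = \int_{\kappa}^{\lambda^{+}} \ln\left(1 + \frac{\rho}{\bar{s}_{\infty}}\lambda\right) f(\lambda)\, d\lambda, \tag{6} \]

where

\[ f(\lambda) = \frac{1}{2\pi\lambda}\sqrt{(\lambda^{+} - \lambda)(\lambda - \lambda^{-})}, \]

$\lambda^{\pm} = (\sqrt{\tau} \pm 1)^2$ and $\lambda^{+} \ge \kappa \ge \lambda^{-}$ is the appropriate constant chosen to maximize the normalized
information rate (6).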
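Proof: Recall the optimal $\bar{s}$ function in (5). According to (1) in Section II-B,

\[ \lim_{(n,m)\to\infty} \bar{s} = \lim_{(n,m)\to\infty} \frac{1}{m}\left|\{k : \lambda_k \ge \kappa\}\right| = \lim_{(n,m)\to\infty} \left[1 - F(\kappa)\right] = \int_{\kappa}^{\lambda^{+}} f(\lambda)\, d\lambda \]

almost surely.

For any positive constant $\bar{P}_{\mathrm{on}}$ and a channel realization $\mathbf{H}$, according to the random matrix theory result
(2), the normalized mutual information between the transmitted signal and the received signal converges
to a constant,

\[ \lim_{(n,m)\to\infty} \bar{\mathcal{I}}(\mathbf{H}) = \lim_{(n,m)\to\infty} \frac{1}{m}\sum_{i=1}^{s}\ln\left(1 + \bar{P}_{\mathrm{on}}\lambda_i\right) = \int_{\kappa}^{\lambda^{+}} \ln\left(1 + \bar{P}_{\mathrm{on}}\lambda\right) f(\lambda)\, d\lambda, \]

almost surely. Thus the normalized information rate converges to a constant

\[ \lim_{(n,m)\to\infty} \bar{\mathcal{I}} = \lim_{(n,m)\to\infty} \mathrm{E}_{\mathbf{H}}\left[\bar{\mathcal{I}}(\mathbf{H})\right] = \int_{\kappa}^{\lambda^{+}} \ln\left(1 + \bar{P}_{\mathrm{on}}\lambda\right) f(\lambda)\, d\lambda. \]

Furthermore, an elementary calculation shows that the choice $\bar{P}_{\mathrm{on}} = \rho/\bar{s}_{\infty}$ satisfies the average power
constraint. Therefore, we have

\[ \bar{\mathcal{I}}_{\infty} = \lim_{(n,m)\to\infty} \bar{\mathcal{I}} = \int_{\kappa}^{\lambda^{+}} \ln\left(1 + \frac{\rho}{\bar{s}_{\infty}}\lambda\right) f(\lambda)\, d\lambda. \]

Finally, since $\bar{s}_{\infty}$, $\bar{P}_{\mathrm{on}}$ and $\bar{\mathcal{I}}_{\infty}$ are all functions of $\kappa$, the optimization problem is to choose the appropriate $\kappa$ to
maximize $\bar{\mathcal{I}}_{\infty}$. ■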

This theorem proves that the optimal normalized number of on-beams $\bar{s}$ converges to a constant
independent of the specific channel realization for a given SNR requirement. The principle behind this
theorem is the same as that of channel hardening [41]: the characteristics of a MIMO channel become
deterministic as the numbers of transmit and receive antennas approach infinity.

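To find explicit formulas to calculate the key parameters and the corresponding performance, we need
the following variable change

\[ \lambda(t) = \frac{1}{y}\left(1 + y - 2\sqrt{y}\cos(t)\right), \tag{7} \]

where $y = \frac{m}{n} = \frac{1}{\tau}$ and $t \in [0, \pi]$. After the variable change, the asymptotic empirical density function of
$t$ can be written as

\[ f_T(t) = \begin{cases} \frac{1}{\pi}\,\frac{1 - \cos(2t)}{1 + y - 2\sqrt{y}\cos(t)} & \text{if } y < 1 \\ \frac{1}{\pi}\left(1 + \cos(t)\right) & \text{if } y = 1. \end{cases} \tag{8} \]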
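For readers who want to see where the density in the angle variable comes from, the following short derivation sketches the change of variables. It assumes the limiting density $f(\lambda) = \frac{1}{2\pi\lambda}\sqrt{(\lambda^{+}-\lambda)(\lambda-\lambda^{-})}$ with $\lambda^{\pm} = (1 \pm \sqrt{y})^2 / y$, which is how the edges $(\sqrt{n/m} \pm 1)^2$ read when written in terms of $y = m/n$; it is offered as a consistency check rather than as the authors' own derivation.

```latex
\begin{align*}
\lambda(t) &= \tfrac{1}{y}\bigl(1 + y - 2\sqrt{y}\cos t\bigr),
  \qquad \frac{d\lambda}{dt} = \frac{2}{\sqrt{y}}\sin t, \\
\lambda^{+} - \lambda(t) &= \frac{2}{\sqrt{y}}\,(1 + \cos t),
  \qquad \lambda(t) - \lambda^{-} = \frac{2}{\sqrt{y}}\,(1 - \cos t), \\
f_T(t) &= f\bigl(\lambda(t)\bigr)\,\frac{d\lambda}{dt}
  = \frac{1}{2\pi\lambda(t)}\cdot\frac{2}{\sqrt{y}}\sin t\cdot\frac{2}{\sqrt{y}}\sin t
  = \frac{1}{\pi}\,\frac{1 - \cos(2t)}{1 + y - 2\sqrt{y}\cos t}.
\end{align*}
```

For $y = 1$ the denominator equals $2(1 - \cos t)$ and the expression simplifies to $\frac{1}{\pi}(1 + \cos t)$, so the two cases of the density agree.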



Define $a$ such that $\lambda(a) = \kappa$, where $\kappa$ is the optimal threshold in Theorem 2. Then we have the following
corollary according to Theorem 2.

Corollary 1: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, the optimal $\bar{s}$ function
converges to a constant,

\[ \bar{s}_{\infty} = \lim_{(n,m)\to\infty} \bar{s} = \int_{a}^{\pi} f_T(t)\, dt, \tag{9} \]

almost surely, and the corresponding $\bar{\mathcal{I}}$ converges to a constant,

\[ \bar{\mathcal{I}}_{\infty} = \lim_{(n,m)\to\infty} \bar{\mathcal{I}} = \int_{a}^{\pi} \ln\left(1 + \frac{\rho}{y\bar{s}_{\infty}}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right) f_T(t)\, dt, \tag{10} \]

where $a \in [0, \pi)$ is chosen to maximize the normalized information rate (10).

Since the variable change (7) is invertible, finding the optimal $\kappa$ in Theorem 2 is equivalent to finding the
optimal $a$ in Corollary 1. The following theorem gives a method to find the optimal $a$.

Theorem 3: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, then $\frac{d\bar{\mathcal{I}}_{\infty}}{da} = 0$ has at most
one solution in the domain $(0, \pi)$. The optimal $a$ to maximize $\bar{\mathcal{I}}_{\infty}$ is either the unique solution of
$\frac{d\bar{\mathcal{I}}_{\infty}}{da} = 0$ in $(0, \pi)$ if it exists, or $0$ if $\frac{d\bar{\mathcal{I}}_{\infty}}{da} < 0$ for all $a \in (0, \pi)$.

Proof: See Appendix B. ■

The following corollaries show how the optimal $a$ and the optimal $\bar{s}_{\infty}$ change when the average power
constraint $\rho$ increases. The results will be applied to MIMO systems with finitely many antennas in Section
III-C.

Corollary 2: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, the optimal $a$ to maximize
$\bar{\mathcal{I}}_{\infty}$ is a non-increasing function of $\rho$.

Proof: See Appendix C. ■

Corollary 3: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, the optimal number of
on-beams to maximize $\bar{\mathcal{I}}_{\infty}$ is a non-decreasing function of $\rho$.

Proof: Note that $\bar{s}_{\infty} = \int_{a}^{\pi} f_T(t)\, dt$, which is a monotone decreasing function of $a$. This corollary
follows from Corollary 2. ■
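The behaviour described by these corollaries can be explored numerically. The Python sketch below is a rough illustration under stated assumptions, not the closed-form method of the paper: it assumes the angle-domain density $f_T(t) = \frac{1}{\pi}\frac{1-\cos 2t}{1+y-2\sqrt{y}\cos t}$ (with the $y = 1$ case $\frac{1}{\pi}(1+\cos t)$), evaluates the normalized number of on-beams and the normalized rate by midpoint integration, and picks the threshold angle by a plain grid search. The function names, the grid size, the step count and the values `y = 0.5` and the SNR list are choices made here for illustration.

```python
import math

def f_t(t, y):
    """Assumed asymptotic density of the angle variable t, for 0 < y <= 1."""
    if y == 1.0:
        return (1.0 + math.cos(t)) / math.pi
    return (1.0 - math.cos(2.0 * t)) / (math.pi * (1.0 + y - 2.0 * math.sqrt(y) * math.cos(t)))

def integrate(g, lo, hi, steps=400):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def s_inf(a, y):
    """Normalized number of on-beams for threshold angle a."""
    return integrate(lambda t: f_t(t, y), a, math.pi)

def rate_inf(a, y, rho):
    """Normalized information rate for threshold angle a and SNR rho."""
    s = s_inf(a, y)
    if s <= 0.0:
        return 0.0
    gain = rho / (y * s)  # constant on-power divided by y, as in the rate integral
    return integrate(
        lambda t: math.log(1.0 + gain * (1.0 + y - 2.0 * math.sqrt(y) * math.cos(t))) * f_t(t, y),
        a, math.pi)

def best_threshold(y, rho, grid=90):
    """Grid search over a in [0, pi); a stand-in for solving the stationarity condition."""
    candidates = [math.pi * k / grid for k in range(grid)]
    return max(candidates, key=lambda a: rate_inf(a, y, rho))

y = 0.5  # illustrative ratio m/n
results = []
for rho in (0.1, 1.0, 10.0, 100.0):
    a_opt = best_threshold(y, rho)
    results.append((rho, a_opt, s_inf(a_opt, y), rate_inf(a_opt, y, rho)))

for rho, a_opt, s_opt, rate in results:
    print(f"rho={rho:6.1f}  a={a_opt:.3f}  s_inf={s_opt:.3f}  rate={rate:.4f}")
```

With these settings the selected angle should not increase, and the fraction of on-beams should not decrease, as the SNR grows. A root finder applied to the derivative would replace the grid search in a more careful implementation.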

Based on the above asymptotic results, the design problem for perfect beamforming (Problem 2) can
be solved. According to Theorem 3, the asymptotically optimal threshold, say $a_{\infty}$, can be found by checking
$\frac{d\bar{\mathcal{I}}_{\infty}}{da}$. The corresponding optimal normalized number of on-beams and the normalized information rate
can be computed by substituting $a_{\infty}$ into (9) and (10) respectively.

However, the calculations involve integrals, which may be computationally complex. To simplify the
computation, Propositions 1-3 express the integrals in terms of special functions that are defined by infinite
series. Generally, evaluating the series is much easier than numerical integration. To make the
expressions clear, the following notation is used.

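\[ r \triangleq \sqrt{y}, \tag{11} \]

\[ \alpha \triangleq \frac{y\bar{s}_{\infty}}{\rho}, \tag{12} \]

\[ w \triangleq \frac{1}{2}\left(1 + y + \alpha + \sqrt{(1 + y + \alpha)^2 - 4y}\right), \tag{13} \]

\[ u \triangleq \frac{1}{2\sqrt{y}}\left(1 + y + \alpha - \sqrt{(1 + y + \alpha)^2 - 4y}\right), \tag{14} \]

\[ \theta_r \triangleq \tan^{-1}\left(\frac{r\sin(a)}{1 - r\cos(a)}\right) \tag{15} \]

for $r\cos(a) \neq 1$, and

\[ \theta_u \triangleq \tan^{-1}\left(\frac{u\sin(a)}{1 - u\cos(a)}\right) \tag{16} \]

for $u\cos(a) \neq 1$. There are also three special functions defined by series. The first one is called the
dilogarithm in the literature [42] and is defined as

\[ \mathrm{Li}_2(x) \triangleq \sum_{n=1}^{\infty} \frac{x^n}{n^2} \]

for $|x| \le 1$. We define the other two as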

00 r l e ilt ( l ~ X (u\ k 1 00 2k (u\ k * 



=1 \fc=l fc=Z 



and 



1=1 \ k=l 



-y 



x) 



(17) 



(18) 



(19) 



for \u\ < 1, |r| < 1 and |^| < 1. 

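Proposition 1: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, the normalized number
of on-beams $\bar{s}_{\infty}$ (as a function of $a$) is given by

\[ \bar{s}_{\infty} = \begin{cases} \frac{1}{\pi}\left\{\pi - a - \frac{1}{r}\sin(a) + \frac{1 - r^2}{r^2}\theta_r\right\} & \text{if } y < 1 \\ \frac{1}{\pi}\left\{\pi - a - \sin(a)\right\} & \text{if } y = 1. \end{cases} \]

Proof: See Appendix D. ■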
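A closed form of this kind is easy to cross-check against direct numerical integration. The Python sketch below does so under explicit assumptions: it takes the closed form to be $\frac{1}{\pi}\{\pi - a - \frac{1}{r}\sin a + \frac{1-r^2}{r^2}\theta_r\}$ for $y < 1$ and $\frac{1}{\pi}\{\pi - a - \sin a\}$ for $y = 1$, with $r = \sqrt{y}$ and $\theta_r = \tan^{-1}\frac{r\sin a}{1 - r\cos a}$, and it takes the density to be $f_T(t) = \frac{1}{\pi}\frac{1-\cos 2t}{1+y-2\sqrt{y}\cos t}$. The function names and the test points are illustrative choices.

```python
import math

def f_t(t, y):
    """Assumed asymptotic density of the angle variable t, for 0 < y <= 1."""
    if y == 1.0:
        return (1.0 + math.cos(t)) / math.pi
    return (1.0 - math.cos(2.0 * t)) / (math.pi * (1.0 + y - 2.0 * math.sqrt(y) * math.cos(t)))

def s_inf_numeric(a, y, steps=20_000):
    """Midpoint-rule integral of f_t over [a, pi]."""
    h = (math.pi - a) / steps
    return h * sum(f_t(a + (i + 0.5) * h, y) for i in range(steps))

def s_inf_closed(a, y):
    """Closed form for the same integral (assumed reading of the proposition)."""
    if y == 1.0:
        return (math.pi - a - math.sin(a)) / math.pi
    r = math.sqrt(y)
    # For r < 1 the denominator 1 - r*cos(a) is positive, so atan2 equals the arctangent.
    theta_r = math.atan2(r * math.sin(a), 1.0 - r * math.cos(a))
    return (math.pi - a - math.sin(a) / r + (1.0 - r * r) / (r * r) * theta_r) / math.pi

checks = [(a, y, s_inf_closed(a, y), s_inf_numeric(a, y))
          for y in (0.25, 0.5, 1.0)
          for a in (0.0, 0.8, 2.0)]

for a, y, closed, numeric in checks:
    print(f"y={y:4.2f}  a={a:3.1f}  closed={closed:.8f}  numeric={numeric:.8f}")
```

At $a = 0$ both forms should return 1, since all beams are then on.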

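Proposition 2: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, the normalized
information rate $\bar{\mathcal{I}}_{\infty}$ (as a function of the threshold $a$) is given by

\[ \bar{\mathcal{I}}_{\infty} = \begin{cases} \left[\ln(w) - \ln(\alpha)\right]\bar{s}_{\infty} + J_0 + J_1 + J_2 & \text{if } y < 1 \\ \left[\ln(w) - \ln(\alpha)\right]\bar{s}_{\infty} + J_0 + J_1 & \text{if } y = 1, \end{cases} \]

where

\[ J_0 = \frac{1}{\pi r}\left\{\sin(a)\left[1 - \ln\left(1 + u^2 - 2u\cos(a)\right)\right] - u(\pi - a) - \frac{1 - u^2}{u}\theta_u\right\}, \]

\[ J_1 = \frac{i(1 + r^2)}{2\pi r^2}\left[\mathrm{Li}_2\left(ue^{-ia}\right) - \mathrm{Li}_2\left(ue^{ia}\right)\right], \]

and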

\[ J_2 = \frac{1 - r^2}{2\pi r^2}\left[-2\ln(1 - ur)\,(\pi - a - \theta_r) + i\,\mathrm{Sr}_1(u, r, a) - i\,\mathrm{Sr}_1(u, r, -a)\right]. \]

Proof: See Appendix E. ■
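The role of $w$ and $u$ in this proposition can be seen from a short factorization. The sketch below assumes the notation $r = \sqrt{y}$, $\alpha = y\bar{s}_{\infty}/\rho$, and $w$, $u$ defined as the larger root of $w^2 - (1+y+\alpha)w + y = 0$ and $u = r/w$; under these assumptions the argument of the logarithm in the rate integral factors as follows. It is a consistency check, not a reproduction of the appendix proof.

```latex
\begin{align*}
1 + \frac{\rho}{y\,\bar{s}_{\infty}}\bigl(1 + y - 2\sqrt{y}\cos t\bigr)
  &= \frac{1}{\alpha}\bigl(1 + y + \alpha - 2r\cos t\bigr)
   = \frac{w}{\alpha}\bigl(1 + u^{2} - 2u\cos t\bigr), \\
\text{since}\quad wu = r, \qquad
w\,(1 + u^{2}) &= w + \frac{y}{w} = 1 + y + \alpha, \\
\ln\Bigl(1 + \frac{\rho}{y\,\bar{s}_{\infty}}\bigl(1 + y - 2\sqrt{y}\cos t\bigr)\Bigr)
  &= \ln(w) - \ln(\alpha) + \ln\bigl(1 + u^{2} - 2u\cos t\bigr).
\end{align*}
```

Integrating the first two terms against $f_T(t)$ over $[a, \pi]$ gives $[\ln(w) - \ln(\alpha)]\,\bar{s}_{\infty}$, and the remaining terms of the proposition account for the integral of $\ln(1 + u^2 - 2u\cos t)\, f_T(t)$.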
Proposition 3: If $m$ and $n$ approach infinity simultaneously with $y = \frac{m}{n}$ fixed, $\frac{d\bar{\mathcal{I}}_{\infty}}{da}$ is given by



da 7r 



1 - In 1 + 



SooV 



(l + r 2 — 2r cos (a)) 



Vjd 



where 



and 



l-cos(2a) if y < 1 
1+r 2 — 2r cos(a) " 

1 + cos (a) if y = 1 



irw(\—ur) 

■K—a 



it — a 



l—u 2 q 1—r 2 Q 



u(r—u) 



0u + 



r(r—u) 



■Kw(l — u) TTWU(1 — U) 



if y < 1 
if 2/ = 1- 



Proof: See Appendix F. ■

Following the method in [7], [8], Propositions 1-3 provide closed-form formulas to evaluate $\bar{s}_{\infty}$, $\bar{\mathcal{I}}_{\infty}$ and
$\frac{d\bar{\mathcal{I}}_{\infty}}{da}$. In [7], [8], the closed form of the capacity is derived for the CSIR-only case, where all $L_T$ available
beams are turned on. The results in [7], [8] can be viewed as a special case of Proposition 2 where $a = 0$.

2) Asymptotic Analysis for CSITR Case: To compare the power on/off strategy with water-filling power
control (which corresponds to the CSITR case), we present asymptotic formulas to evaluate the CSITR capacity. As
far as the authors know, these asymptotic results are presented for the first time.

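It is well known that water-filling power control achieves the capacity when perfect CSIT is assumed [1].
Let $m = \min(L_R, L_T)$, $n = \max(L_R, L_T)$ and $\tau = n/m$. When $m$ and $n$ approach infinity simultaneously with
the ratio $\tau$ fixed, according to (2), the normalized capacity is given by

\[ \bar{C}_{\infty} = \lim_{(m,n)\to\infty} \frac{C}{m} = \int_{\max(\lambda^{-}, 1/\nu)}^{\lambda^{+}} \ln(\lambda\nu)\, f(\lambda)\, d\lambda, \]

where $f(\lambda) = \frac{1}{2\pi\lambda}\sqrt{(\lambda^{+} - \lambda)(\lambda - \lambda^{-})}$, $\lambda^{\pm} = (\sqrt{\tau} \pm 1)^2$ and $\nu$ is the Lagrange multiplier chosen to
satisfy the average power constraint,

\[ \rho = \int_{\max(\lambda^{-}, 1/\nu)}^{\lambda^{+}} \left(\nu - \frac{1}{\lambda}\right) f(\lambda)\, d\lambda. \]

To derive closed forms for the integrals, consider the same variable change as in (7). Then the asymptotic
normalized capacity is given by

\[ \bar{C}_{\infty} = \int_{a}^{\pi} \ln\left(\frac{\nu}{y}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right) f_T(t)\, dt, \]

where

\[ a = \begin{cases} \cos^{-1}\left(\frac{1 + y - y/\nu}{2\sqrt{y}}\right) & \text{if } \lambda^{-} \le \frac{1}{\nu} \le \lambda^{+} \\ 0 & \text{if } \frac{1}{\nu} < \lambda^{-}, \end{cases} \]

$\nu$ is the Lagrange multiplier chosen to satisfy the average power constraint

\[ \rho = \int_{a}^{\pi} \left(\nu - \frac{y}{1 + y - 2\sqrt{y}\cos(t)}\right) f_T(t)\, dt, \]

and $f_T(t)$ is given in (8).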

The following propositions give the closed forms for the average power and the normalized capacity 
as a function of v. To make the presentation clearer, the notations in (II 1 H 1 9ft are used. 

Proposition 4: If m and n approach infinity simultaneously with $y = m/n$ fixed, the relationship between
the power constraint $\rho$ and the Lagrange multiplier $\nu$ is given by



4> 



where 



St (*) dt, 



and 



J A 



2tt 



l_ r 2 

7i — a 



TV — a) 



i±d e + i f i l_^\ 

l-r 2 r ^ 2 \l-re~ ia l-re ia ) 



Proof: See Appendix 101 




I) 

-i ( i+v-l 

2v^ 



if A- < i < A- 
— i/ — 

if - < A 



if y < 1 
if y= 1 
Proposition 5: If m and n approach infinity simultaneously with $y = m/n$ fixed, the normalized capacity
$C_\infty$ is given by

In (A + J 5 + J 6 + J 7 ify<l 
In ( v - ) + J 5 + J 6 if = 1 



where 



/t (*) dt, 

J 5 = — |sin (a) [l — In (l + r 2 — 2r cos (a))] — r (7T — a) — ^ r j 9, 

J 6 = [Li 2 {re-) - Li 2 (re*)] , 

+2 In (1 - r 2 ) (tt - a - 9 r ) + i [Sr 2 (r, -a) - Sr 2 (r, a)] } , 



cos -i(i±^n if A < i < A+ 



and 

f -i ( A 

1 if i < A". 

Proof: See Appendix IHI ■ 
Based on the above propositions, the Lagrange multiplier $\nu$ and the corresponding normalized capacity
$C_\infty$ can be computed directly for a given SNR requirement $\rho$.



C. MIMO Systems with Finitely Many Antennas

The asymptotic results in Section III-B can be applied to MIMO systems with finitely many antennas.
It is often the case that asymptotic results are accurate enough for MIMO systems with finitely many
antennas [7], [8], [37], [41], [43], and the same holds for the asymptotic results in Section III-B. Theorem El proves that
the optimal normalized number of on-beams s converges to a constant asymptotically. We will show that
a constant s is near optimal for MIMO systems with finitely many antennas. Moreover, according to the
asymptotic result in Corollary |3J, the optimal s is a nondecreasing function of the average SNR $\rho$. This
is consistent with the results in [20], [21], which consider the special case of transmit antenna selection
and show that at most one beam should be turned on when $\rho$ is small enough and m beams should be
on when $\rho$ is sufficiently large. The results in this paper, however, are more general.

Before applying the asymptotic results, however, it is worth noting the difference between the
asymptotic case and the case of finitely many antennas. In the asymptotic case, s can be any rational number in
[0, 1]. In the case of finitely many antennas, on the other hand, s can only take finitely many discrete values,
$s \in \{1/m, 2/m, \ldots, 1\}$, where $m = \min(L_R, L_T)$ is the dimension of the MIMO system.

To apply the asymptotic results to the finite case, we use the following procedure.

1) For a given MIMO system with $L_T$ transmit antennas and $L_R$ receive antennas, define $m = \min(L_R, L_T)$,
$n = \max(L_R, L_T)$ and $y = m/n$. According to the asymptotic analysis and formulas in Section III-B,
evaluate the asymptotic optimal threshold $a_\infty$ and the asymptotic optimal normalized number of
on-beams $s_\infty$ for a given average SNR requirement $\rho$.

2) If $s_\infty < 1/m$, then go to 3). Otherwise, we choose the optimal s as the one corresponding to the
larger information rate among the discrete values adjacent to $s_\infty$. Specifically, let $s_1 = \frac{1}{m}\lceil m s_\infty \rceil$ and $s_2 = \frac{1}{m}\lfloor m s_\infty \rfloor$,
where $\lceil\cdot\rceil$ denotes the ceiling function and $\lfloor\cdot\rfloor$ represents the floor function. Compare the corresponding
performance (evaluated by substituting the corresponding a into the asymptotic formula for $\bar{\mathcal{I}}_\infty$ in
Proposition El) and choose the better one as the optimal s. According to Theorem El, the $ms$ beams
corresponding to the largest $ms$ eigenvalues of W are always turned on, independent of the specific
channel state realization H. The power on each on-beam is P on = -£= and the corresponding X can 
be evaluated by asymptotic formula for 2^ . 
3) If Sqo < — , then at most one beam should be turned on. Put s = — and P on = We turn on/off 
the strongest beam, which corresponds to the largest eigenvalue of W, according to the following 
threshold test, 

on 
Ai ^ K 
off 

where k = J (l + y - (o w )). 
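The discretization in steps 2) and 3) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `choose_normalized_on_beams` is hypothetical, and `rate_of_s` is a user-supplied callable standing in for the asymptotic information rate formula evaluated at a candidate s; neither is defined in the paper.

```python
import math

def choose_normalized_on_beams(s_inf, m, rate_of_s):
    """Map the asymptotic optimum s_inf to an admissible value k/m.

    rate_of_s(s) is assumed to return the (asymptotic) information rate
    obtained with normalized number of on-beams s.
    Returns (s, mode), where mode is "constant" when k beams are always
    on and "threshold" when the strongest beam is switched on/off.
    """
    if s_inf < 1.0 / m:
        # Step 3): at most one beam, governed by a threshold test.
        return 1.0 / m, "threshold"
    # Step 2): compare the two adjacent admissible values.
    k_hi = min(m, math.ceil(m * s_inf))
    k_lo = max(1, math.floor(m * s_inf))
    best = max(sorted({k_lo, k_hi}), key=lambda k: rate_of_s(k / m))
    return best / m, "constant"
```

For example, with m = 4 and $s_\infty = 0.6$ the candidates are s = 0.5 and s = 0.75, and the one with the larger rate is kept.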

The power on/off strategy designed according to the above procedure is called the power on/off strategy
with a constant number of on-beams. When the given average SNR $\rho$ is large enough that $s_\infty \ge 1/m$, a
constant number of on-beams are turned on, independent of the specific channel realization H. The only
exception occurs when $\rho$ is so low that $s_\infty < 1/m$, in which case either the strongest beam is turned on (when
$\lambda_1 > \kappa$) or no beam is on (when $\lambda_1 < \kappa$). Although this strategy is designed according to the asymptotic
results, the simulation results show that it is near optimal for MIMO systems with finitely many antennas.

Simulation results are given in Fig. Eland Fig. EJ The information rate versus SNR is presented in Fig.
Eta) while Fig. Eta) shows the information rate versus $E_b/N_0$. Different MIMO systems with 4 x 2, 4 x 3 and
4 x 4 antennas are considered. The solid and dashed lines are the simulated information rates for the
CSITR case and the power on/off strategy, respectively. The "x" and "+" markers are the information
rates calculated according to the asymptotic analysis for the CSITR case and the power on/off strategy, respectively.
The difference among them is almost unnoticeable. To make the performance difference clearer, we also
define the relative performance as the ratio of the considered information rate to the capacity of a 4 x 2
MIMO system achieved by water-filling power control with perfect CSITR. The relative performance for different
MIMO systems is given in Fig. Etb) and Etb). The simulation results show that the power on/off strategy
(dashed lines) achieves more than 90% of the capacity provided by water-filling power control (solid
lines) and has a significant gain over the CSIR case (dash-dot lines) at low SNR. Note that there are
several valleys in the relative performance curves. This is because s can only take discrete
values. Furthermore, the performance evaluated by the asymptotic analysis ("x" markers for the CSITR case and
"+" markers for the power on/off strategy) is very close to the simulated performance. In conclusion, the
power on/off strategy is near optimal for all SNR regimes and the corresponding performance is well
characterized by the asymptotic analysis.

Since the asymptotic results are accurate for the case of finitely many antennas, we can also conclude that
the information rate achieved by the power on/off strategy or by water-filling power allocation grows linearly
with the system dimension $m = \min(L_R, L_T)$ for a given SNR. That is, for a given $L_T \times L_R$ MIMO
system, the normalized information rate $\bar{\mathcal{I}}$ and the normalized capacity $C$ are constants determined by
the SNR $\rho$. The total information rate is that constant multiplied by the dimension m.

IV. Power On/Off Strategy with a Finite Size Beamforming Codebook

This section quantifies the effect of imperfect beamforming due to finite rate feedback.
Compared with the capacity for the perfect CSITR case, the performance loss of the power on/off strategy with
finite rate feedback comes from two sources: power on/off and imperfect beamforming. While Section III characterizes
the information rate of the power on/off strategy for perfect beamforming, this section characterizes the
overall information rate by quantifying the effect of imperfect beamforming.

Recall the power on/off strategy optimization problem in Problem 1. Since the power on/off strategy
with a constant number of on-beams is simple and near optimal, we focus on the effect of imperfect
beamforming when the number of on-beams is a constant. For a constant number of on-beams, the
beamforming codebook contains beamforming matrices of the same rank. Specifically, let the optimal
number of on-beams be s and the asymptotic optimal normalized number of on-beams be $s_\infty$. When
$s_\infty \ge 1/m$ (true for most SNR regimes),

$$\mathcal{B} = \left\{\mathbf{Q}_i \in \mathbb{C}^{L_T \times s} : \mathbf{Q}_i^\dagger \mathbf{Q}_i = \mathbf{I}_s,\ 1 \le i \le 2^{R_{fb}}\right\}.$$

The only exception occurs when the required SNR is so low that $s_\infty < 1/m$ (see Section III-C for details),
where the codebook contains beamforming vectors and one extra index for the case that the transmitter
is turned off. In this case,

$$\mathcal{B} = \left\{\mathbf{Q}_i \in \mathbb{C}^{L_T \times 1} : \mathbf{Q}_i^\dagger \mathbf{Q}_i = 1,\ 1 \le i \le 2^{R_{fb}} - 1\right\} \cup \left\{\mathbf{Q}_\phi\right\},$$

where $\mathbf{Q}_\phi$ is the artificial notation for the case that the transmitter is turned off (no beam is on). Since
there is no beamforming when no beam is on, the matrix $\mathbf{Q}_\phi$ has no effect in the analysis of imperfect
beamforming. Thus the effect of imperfect beamforming can be analyzed for

$$\mathcal{B} = \left\{\mathbf{Q}_i \in \mathbb{C}^{L_T \times 1} : \mathbf{Q}_i^\dagger \mathbf{Q}_i = 1,\ 1 \le i \le 2^{R_{fb}} - 1\right\},$$

which can be viewed as a codebook containing $2^{R_{fb}} - 1$ beamforming matrices of rank 1. We call a
beamforming codebook containing beamforming matrices of the same rank a single rank beamforming
codebook. The power on/off strategy optimization problem (Problem 1) is thereby simplified to designing a single
rank beamforming codebook $\mathcal{B}$ with size $2^{R_{fb}}$ or $2^{R_{fb}} - 1$ and the corresponding feedback function $\varphi(\cdot)$
to maximize the corresponding information rate.

To solve the optimization problem and make the performance analysis tractable, an asymptotically optimal
feedback function is introduced and discussed in Section IV-A. The effect of a single rank beamforming
codebook with this asymptotically optimal feedback function is characterized in Section IV-B.

A. Feedback Function 

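This subsection considers the feedback function for a given single rank beamforming codebook.

The optimal feedback function is given as follows. When the number of on-beams is a constant s, the
transmitter transmits a constant power $sP_{on}$. For a given single rank beamforming codebook, it is easy to
verify that the optimal feedback function $\varphi^*(\cdot)$ is given by

$$\varphi^*(\mathbf{H}) = \arg\max_{1 \le i \le |\mathcal{B}|} \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right|.$$

However, this paper considers a suboptimal but asymptotically optimal feedback function because the
corresponding performance can be analyzed. Consider the singular value decomposition $\mathbf{H} =
\mathbf{U}\mathbf{\Lambda}\mathbf{V}^\dagger$. Define $\mathbf{V}_s$ as the $L_T \times s$ matrix composed of the singular vectors in $\mathbf{V}$ corresponding to the
largest s singular values. Then both $\mathbf{V}_s$ and a beamforming matrix $\mathbf{Q} \in \mathcal{B}$ can be viewed as generator
matrices of s-planes in the Grassmann manifold $\mathcal{G}_{L_T,s}(\mathbb{C})$ (see Section II-C for the relevant definitions). Denote
the planes generated by $\mathbf{V}_s$ and $\mathbf{Q}$ as $\mathcal{P}(\mathbf{V}_s)$ and $\mathcal{P}(\mathbf{Q})$, respectively. The feedback function $\varphi(\cdot)$ is
defined as

$$\varphi(\mathbf{H}) \triangleq \arg\min_{1 \le i \le |\mathcal{B}|} d_c\left(\mathcal{P}(\mathbf{Q}_i), \mathcal{P}(\mathbf{V}_s)\right) = \arg\max_{1 \le i \le |\mathcal{B}|} \mathrm{tr}\left(\mathbf{V}_s^\dagger\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{V}_s\right), \tag{20}$$

where $d_c$ is the chordal distance between two elements in $\mathcal{G}_{L_T,s}(\mathbb{C})$.^5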
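The selection rule (20) can be sketched numerically as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, indices are 0-based rather than 1-based, and the squared chordal distance between planes with orthonormal generator matrices is computed through the standard identity $d_c^2 = s - \|\mathbf{Q}^\dagger\mathbf{V}_s\|_F^2$, which is an assumption about the definition given in the earlier section.

```python
import numpy as np

def random_orthonormal(rows, cols, rng):
    """Random matrix with orthonormal columns (used only to build examples)."""
    G = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
    Q, _ = np.linalg.qr(G)
    return Q

def squared_chordal_distance(Q, V):
    """Squared chordal distance between the planes spanned by Q and V."""
    s = Q.shape[1]
    return s - np.linalg.norm(Q.conj().T @ V, "fro") ** 2

def feedback_index(codebook, H, s):
    """Index of the codeword closest (in chordal distance) to the plane
    spanned by the s dominant right singular vectors of H."""
    _, _, Vh = np.linalg.svd(H)
    V_s = Vh.conj().T[:, :s]
    distances = [squared_chordal_distance(Q, V_s) for Q in codebook]
    return int(np.argmin(distances))
```

Minimizing the distance is the same as maximizing $\mathrm{tr}(\mathbf{V}_s^\dagger\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{V}_s)$, so the search costs one small matrix product per codeword.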

The feedback function (20) is asymptotically optimal. When the size of $\mathcal{B}$ approaches infinity, the
beamforming codebook $\mathcal{B}$ can be constructed so that the chordal distance between $\mathcal{P}(\mathbf{Q}_{\varphi(\mathbf{H})})$ and $\mathcal{P}(\mathbf{V}_s)$
approaches zero for any given $\mathbf{V}_s$. The information rate achieved by the suboptimal feedback function
then approaches that of perfect beamforming, which is also the limit that the optimal feedback strategy can
achieve.

5 Ties, i.e., the case that there exist $\mathbf{Q}_1, \mathbf{Q}_2 \in \mathcal{B}$ such that $\mathbf{Q}_1 \ne \mathbf{Q}_2$ but $d_c(\mathcal{P}(\mathbf{Q}_1), \mathcal{P}(\mathbf{V}_s)) = \min_{\mathbf{Q} \in \mathcal{B}} d_c(\mathcal{P}(\mathbf{Q}), \mathcal{P}(\mathbf{V}_s)) = d_c(\mathcal{P}(\mathbf{Q}_2), \mathcal{P}(\mathbf{V}_s))$,
can be broken arbitrarily because the probability of ties is zero.

Theorem 4 shows a useful property of the asymptotically optimal feedback strategy, which will be used to quantify
the effect of a given single rank beamforming codebook in Section IV-B. The following lemma is used
in the proof of Theorem 4.

Lemma 1: Let $\mathbf{A} \in \mathbb{C}^{k \times k}$ be a Hermitian matrix. If $\mathbf{A} = \mathbf{U}\mathbf{A}\mathbf{U}^\dagger$ for any $k \times k$ unitary matrix $\mathbf{U}$, then
$\mathbf{A} = \mu\mathbf{I}$ for some constant $\mu \in \mathbb{R}$.

Proof: For any Hermitian $\mathbf{A}$, there exists a $k \times k$ unitary $\mathbf{U}$ such that $\mathbf{U}\mathbf{A}\mathbf{U}^\dagger = \mathbf{\Lambda}$, where $\mathbf{\Lambda}$ is
diagonal with real diagonal elements. But $\mathbf{U}\mathbf{A}\mathbf{U}^\dagger = \mathbf{A}$, so $\mathbf{A}$ is diagonal and real. Furthermore,
taking $\mathbf{U}$ to be a permutation matrix, it is easy to verify that the diagonal elements are identical. ■

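Theorem 4: Let $\mathcal{B}$ be a single rank beamforming codebook with rank s where $1 \le s \le L_T$. Let $\mathbf{V}_s$ be
a random matrix uniformly distributed on the Stiefel manifold $\mathcal{S}_{L_T,s}(\mathbb{C})$. Let

$$\varphi(\mathbf{V}_s) = \arg\min_{1 \le i \le |\mathcal{B}|} d_c\left(\mathcal{P}(\mathbf{Q}_i), \mathcal{P}(\mathbf{V}_s)\right).$$

Then

$$E_{\mathbf{V}_s}\left[\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s\right] = \mu\mathbf{I},$$

where

$$\mu = 1 - \frac{1}{s}E_{\mathbf{V}_s}\left[d_c^2\left(\mathcal{P}(\mathbf{Q}_{\varphi(\mathbf{V}_s)}), \mathcal{P}(\mathbf{V}_s)\right)\right]$$

is a non-negative constant.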

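Proof: For any $s \times s$ unitary matrix $\mathbf{U}$,

$$\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})} = \arg\max_{\mathbf{Q}_i \in \mathcal{B}} \mathrm{tr}\left((\mathbf{Q}_i^\dagger\mathbf{V}_s\mathbf{U})(\mathbf{Q}_i^\dagger\mathbf{V}_s\mathbf{U})^\dagger\right) = \arg\max_{\mathbf{Q}_i \in \mathcal{B}} \mathrm{tr}\left((\mathbf{Q}_i^\dagger\mathbf{V}_s)(\mathbf{Q}_i^\dagger\mathbf{V}_s)^\dagger\right) = \mathbf{Q}_{\varphi(\mathbf{V}_s)}.$$

Since $\mathbf{V}_s$ is uniformly distributed on $\mathcal{S}_{L_T,s}(\mathbb{C})$, $\mathbf{V}_s\mathbf{U}$ is uniformly distributed on $\mathcal{S}_{L_T,s}(\mathbb{C})$ as well [40].
Then,

$$\begin{aligned}
\mathbf{U}^\dagger E_{\mathbf{V}_s}\left[\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s\right]\mathbf{U}
&= E_{\mathbf{V}_s}\left[\mathbf{U}^\dagger\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s\mathbf{U}\right] \\
&\overset{(a)}{=} E_{\mathbf{V}_s}\left[(\mathbf{V}_s\mathbf{U})^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})}\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})}^\dagger(\mathbf{V}_s\mathbf{U})\right] \\
&\overset{(b)}{=} E_{\mathbf{V}_s\mathbf{U}}\left[(\mathbf{V}_s\mathbf{U})^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})}\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})}^\dagger(\mathbf{V}_s\mathbf{U})\right] \\
&\overset{(c)}{=} E_{\mathbf{V}_s}\left[\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s\right],
\end{aligned}$$

where

(a) follows from the fact that $\mathbf{Q}_{\varphi(\mathbf{V}_s\mathbf{U})} = \mathbf{Q}_{\varphi(\mathbf{V}_s)}$,

(b) follows from the fact that $\mathbf{V}_s\mathbf{U}$ and $\mathbf{V}_s$ have the same distribution, and

(c) follows from the variable change from $\mathbf{V}_s\mathbf{U}$ to $\mathbf{V}_s$.

Therefore, $E_{\mathbf{V}_s}\left[\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s\right] = \mu\mathbf{I}$ for some constant $\mu$ according to Lemma 1. The constant $\mu$
is non-negative because $\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{V}_s)}\mathbf{Q}_{\varphi(\mathbf{V}_s)}^\dagger\mathbf{V}_s$ is non-negative definite.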
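Furthermore, an elementary calculation (taking the trace of both sides) shows that

$$\mu = 1 - \frac{1}{s}E_{\mathbf{V}_s}\left[d_c^2\left(\mathcal{P}(\mathbf{Q}_{\varphi(\mathbf{V}_s)}), \mathcal{P}(\mathbf{V}_s)\right)\right].$$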
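The statement of Theorem 4 can be checked numerically with a short Monte Carlo experiment. The sketch below is illustrative only: the helper names are hypothetical, the codebook is drawn at random rather than designed, and the uniform distribution on the Stiefel manifold is sampled by a phase-corrected QR decomposition of a complex Gaussian matrix, which is a standard construction but is an assumption about how one would simulate $\mathbf{V}_s$.

```python
import numpy as np

def haar_stiefel(L_T, s, rng):
    """Sample an L_T x s matrix uniformly from the complex Stiefel manifold."""
    G = rng.standard_normal((L_T, s)) + 1j * rng.standard_normal((L_T, s))
    Q, R = np.linalg.qr(G)
    d = np.diagonal(R)
    return Q * (d / np.abs(d))  # fix column phases so the law is uniform

def estimate_mu(codebook, L_T, s, trials, rng):
    """Estimate E[V^H Q Q^H V] and the mean squared chordal distance,
    where Q is the codeword closest to V in chordal distance."""
    acc = np.zeros((s, s), dtype=complex)
    d2_sum = 0.0
    for _ in range(trials):
        V = haar_stiefel(L_T, s, rng)
        gains = [np.linalg.norm(Q.conj().T @ V, "fro") ** 2 for Q in codebook]
        Q = codebook[int(np.argmax(gains))]
        A = V.conj().T @ Q
        acc += A @ A.conj().T
        d2_sum += s - max(gains)
    return acc / trials, d2_sum / trials
```

The returned matrix should be close to a multiple of the identity, and its normalized trace equals one minus the average squared chordal distance divided by s, as the theorem asserts.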

The constant $\mu$ in the above theorem is related to the average distortion (defined by the squared chordal
distance) of a quantization on the Grassmann manifold. In particular, we are interested in the maximum $\mu$
achievable for a given codebook size. This problem is solved in [36], and we state the result as follows.

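Theorem 5: Let $\mathcal{B}$ be a single rank beamforming codebook with rank s where $1 \le s \le L_T$. Denote the
size of $\mathcal{B}$ by K. Let $\mathcal{P}$ be a random plane uniformly distributed on the Grassmann manifold $\mathcal{G}_{L_T,s}(\mathbb{C})$.
Define the average squared chordal distance as

$$\bar{d}_c^2(\mathcal{B}) \triangleq E_{\mathcal{P}}\left[\min_{\mathbf{Q} \in \mathcal{B}} d_c^2\left(\mathcal{P}(\mathbf{Q}), \mathcal{P}\right)\right].$$

The minimum average squared chordal distance achievable for a given K, say $d_{c,\inf}^2$, is defined as

$$d_{c,\inf}^2 \triangleq \inf_{\mathcal{B}:\, |\mathcal{B}| = K} \bar{d}_c^2(\mathcal{B}).$$

Assume that K is large. $d_{c,\inf}^2$ can be bounded by

$$\frac{t}{t+1}\,\eta^{-\frac{1}{t}}\,2^{-\frac{\log_2 K}{t}} \lesssim d_{c,\inf}^2 \lesssim \frac{\Gamma\left(\frac{1}{t}\right)}{t}\,\eta^{-\frac{1}{t}}\,2^{-\frac{\log_2 K}{t}}, \tag{21}$$

where $t = s(L_T - s)$ is the number of the real dimensions of $\mathcal{G}_{L_T,s}(\mathbb{C})$,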

f 1 ris (L T -i)\ ., 1 < < Lt 

'1 1 VfLr-s (L T -i)\ - f L T / „ <- r ' v*"^ 



and the symbol $\lesssim$ denotes the main order inequality.

Although this theorem is for asymptotically large K, the bounds (21) are accurate enough for relatively
small K. For example, it is shown in [36] that the bounds are tight for K > 10 when $L_T = 4$, $s = 2$.
Furthermore, as the number of real dimensions of the Grassmann manifold (2t) approaches infinity, the
lower bound and the upper bound are asymptotically equal.

It is noteworthy that Theorem 5 holds for Grassmann manifolds with arbitrary dimensions. In [31],
approximations to $d_{c,\inf}^2$ are developed for the case $s = 1$ and for the case that $s > 1$ is fixed and $L_T$ is
asymptotically large. Indeed, the approximation in [31] for $s = 1$ is a lower bound on $d_{c,\inf}^2$. The
approximation in [31] for fixed s and asymptotically large $L_T$ is neither a lower bound nor an upper
bound. A detailed comparison of Theorem 5 and the results of [31] can be found in [36].

Applying Theorem 5 to Theorem 4, the maximum achievable $\mu$, say $\mu_{\sup}$, can be upper and lower bounded.
This result about the suboptimal feedback function will be employed to analyze the effect of a single rank
beamforming codebook on the information rate in Section IV-B.



B. Effect of a Beamforming Codebook 

In this section, the effect of a single rank beamforming codebook is quantified. The
overall performance of the power on/off strategy with finite rate feedback can then be characterized by
combining the asymptotic results in Section III and the effect of a single rank beamforming codebook.

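A lower bound on the information rate is derived first. For a channel state realization $\mathbf{H}$, let $\lambda_i$ be the
i-th largest eigenvalue of $\mathbf{H}^\dagger\mathbf{H}$ and $\mathbf{v}_i$ be the eigenvector corresponding to $\lambda_i$. Then $\mathbf{H}^\dagger\mathbf{H} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^\dagger$ where
$\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_{L_T}]$ and $\mathbf{\Lambda} = \mathrm{diag}[\lambda_1, \lambda_2, \cdots, \lambda_{L_T}]$. For a given optimal number of on-beams s such
that $1 \le s \le L_T$, define $\mathbf{V}_s = [\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_s]$ and $\mathbf{\Lambda}_s = \mathrm{diag}[\lambda_1, \lambda_2, \cdots, \lambda_s]$. Then,

$$\mathbf{V}\mathbf{\Lambda}\mathbf{V}^\dagger \ge \mathbf{V}\begin{bmatrix}\mathbf{\Lambda}_s & \mathbf{0} \\ \mathbf{0} & \mathbf{0}\end{bmatrix}\mathbf{V}^\dagger = \mathbf{V}_s\mathbf{\Lambda}_s\mathbf{V}_s^\dagger,$$

where two matrices $\mathbf{A}$ and $\mathbf{B}$ have the relationship $\mathbf{A} \ge \mathbf{B}$ if $\mathbf{A} - \mathbf{B}$ is non-negative definite. Let $\mathbf{Q}_{\varphi(\mathbf{H})}$
be the feedback beamforming matrix given by the feedback function (20). We have

$$\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}\mathbf{\Lambda}\mathbf{V}^\dagger\mathbf{Q}_{\varphi(\mathbf{H})} \ge \mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\mathbf{\Lambda}_s\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}.$$

Moreover, the matrices on both sides of the above inequality are positive definite. Because $\mathbf{A} \ge \mathbf{B}$ implies
$|\mathbf{A}| \ge |\mathbf{B}|$ for any two positive definite matrices $\mathbf{A}$ and $\mathbf{B}$ [40], we have

$$\ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}\mathbf{\Lambda}\mathbf{V}^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\right| \ge \ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\mathbf{\Lambda}_s\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\right|.$$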

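Therefore, the information rate is lower bounded by

$$\begin{aligned}
\bar{\mathcal{I}} &= \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_{\varphi(\mathbf{H})}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{H}^\dagger\right|\right] \\
&= \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{H}^\dagger\mathbf{H}\mathbf{Q}_{\varphi(\mathbf{H})}\right|\right] \\
&= \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}\mathbf{\Lambda}\mathbf{V}^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\right|\right] \\
&\ge \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\mathbf{\Lambda}_s\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\right|\right].
\end{aligned} \tag{23}$$

The lower bound is tight under the high feedback rate assumption. For the perfect beamforming case, the
lower bound is indeed the information rate itself.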

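Based on this lower bound, an approximation to the information rate can be obtained. Since the entries of
$\mathbf{H}$ are i.i.d. $\mathcal{CN}(0, 1)$, $\mathbf{V}_s$ is uniformly (isotropically) distributed on $\mathcal{S}_{L_T,s}(\mathbb{C})$ and independent of $\mathbf{\Lambda}_s$
[39], [40]. By the lower bound in (23), we have

$$\begin{aligned}
\bar{\mathcal{I}} &\ge \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_s + P_{on}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\mathbf{\Lambda}_s\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\right|\right] \\
&\overset{(a)}{=} \frac{1}{m}E_{\mathbf{\Lambda}_s}\left[E_{\mathbf{V}_s}\left[\ln\left|\mathbf{I}_s + P_{on}\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\mathbf{\Lambda}_s\right|\right]\right] \\
&\overset{(b)}{\le} \frac{1}{m}E_{\mathbf{\Lambda}_s}\left[\ln\left|\mathbf{I}_s + P_{on}E_{\mathbf{V}_s}\left[\mathbf{V}_s^\dagger\mathbf{Q}_{\varphi(\mathbf{H})}\mathbf{Q}_{\varphi(\mathbf{H})}^\dagger\mathbf{V}_s\right]\mathbf{\Lambda}_s\right|\right] \\
&\overset{(c)}{=} \frac{1}{m}E_{\mathbf{H}}\left[\ln\left|\mathbf{I}_s + \mu P_{on}\mathbf{\Lambda}_s\right|\right],
\end{aligned} \tag{24}$$

where

(a) holds because $|\mathbf{I} + \mathbf{A}\mathbf{B}| = |\mathbf{I} + \mathbf{B}\mathbf{A}|$ and $\mathbf{V}_s$ is independent of $\mathbf{\Lambda}_s$,

(b) follows from the concavity of the $\ln|\cdot|$ function [44, prob. 2 on pg. 237], and

(c) follows from Theorem 4, where $\mu$ is defined.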

Although the approximation (24) is neither an upper bound nor a lower bound on the normalized information
rate, it gives a good characterization under the high feedback rate assumption. In fact, for a 4 x 2 MIMO
system with feedback rate $R_{fb} = 4$ bits/channel use, the information rate calculated by (24) is very close
to that evaluated by Monte Carlo simulation (see Fig. @J).
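A small Monte Carlo experiment can make the relation between the simulated rate, the lower bound (23), and the approximation (24) concrete. The sketch below is hypothetical in several respects: the function names are invented for illustration, the codebook is supplied by the caller (for instance a random one), the unnormalized matrix $\mathbf{H}^\dagger\mathbf{H}$ is used with $P_{on}$ as a free parameter, and $\mu$ is estimated empirically from the same samples.

```python
import numpy as np

def random_orthonormal(rows, cols, rng):
    """Random matrix with orthonormal columns (example codewords only)."""
    G = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
    Q, _ = np.linalg.qr(G)
    return Q

def compare_rate_expressions(L_T, L_R, s, P_on, codebook, trials, rng):
    """Return (simulated rate, lower bound as in (23), approximation as
    in (24), estimated mu), all rates in nats per dimension."""
    m = min(L_T, L_R)
    exact = bound = 0.0
    gram = np.zeros((s, s), dtype=complex)
    top_eigs = np.empty((trials, s))
    for k in range(trials):
        H = (rng.standard_normal((L_R, L_T))
             + 1j * rng.standard_normal((L_R, L_T))) / np.sqrt(2.0)
        W = H.conj().T @ H
        w, V = np.linalg.eigh(W)          # eigenvalues in ascending order
        lam_s = w[::-1][:s]               # s largest eigenvalues
        V_s = V[:, ::-1][:, :s]           # matching eigenvectors
        gains = [np.linalg.norm(Q.conj().T @ V_s) ** 2 for Q in codebook]
        Q = codebook[int(np.argmax(gains))]   # chordal-distance feedback
        A = Q.conj().T @ V_s
        exact += np.linalg.slogdet(np.eye(s) + P_on * Q.conj().T @ W @ Q)[1]
        bound += np.linalg.slogdet(np.eye(s) + P_on * (A * lam_s) @ A.conj().T)[1]
        gram += A.conj().T @ A
        top_eigs[k] = lam_s
    mu = np.trace(gram).real / (s * trials)
    approx = np.log1p(mu * P_on * top_eigs).sum(axis=1).mean()
    return exact / (m * trials), bound / (m * trials), approx / m, mu
```

The lower bound never exceeds the simulated rate, while the approximation may fall on either side of it, in line with the remark that (24) is neither an upper nor a lower bound.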

The constant $\mu$ is called the power efficiency factor. The effect of a finite size beamforming codebook can
be viewed as decreasing the $P_{on}$ in © to $\mu P_{on}$ in (24). Thus, for a given codebook size, the beamforming
codebook should be designed to maximize the corresponding power efficiency factor $\mu$ or, equivalently, to
minimize the average squared chordal distance $\bar{d}_c^2$. However, it may be computationally complex to design
a codebook that minimizes $\bar{d}_c^2$ directly. In [27], a criterion of maximizing the minimum chordal distance
between any pair of beamforming matrices (max-min criterion) is proposed to achieve a small $\bar{d}_c^2$. In this
paper, we adopt the max-min criterion to design the beamforming codebook for simplicity. Assuming that
a beamforming codebook is well designed, the maximum achievable $\mu$ can be tightly upper and lower
bounded as functions of the codebook size according to Theorem 5. Note that $K = 2^{R_{fb}}$ when $s_\infty \ge 1/m$
and $K = 2^{R_{fb}} - 1 \approx 2^{R_{fb}}$ when $s_\infty < 1/m$. We have

$$1 - \frac{\Gamma\left(\frac{1}{t}\right)}{st}\,\eta^{-\frac{1}{t}}\,2^{-\frac{R_{fb}}{t}} \lesssim \mu_{\sup} \lesssim 1 - \frac{t}{s(t+1)}\,\eta^{-\frac{1}{t}}\,2^{-\frac{R_{fb}}{t}}, \tag{25}$$

where s is the rank of the single rank beamforming codebook, $t = s(L_T - s)$ and $\eta$ is given in (22).
Comparing the imperfect beamforming case to the perfect beamforming case, the effective power loss
$1 - \mu_{\sup}$ decays exponentially with a rate proportional to $R_{fb}/m^2$ (specifically, the exact rate of exponential
decay is $\frac{m^2}{s(L_T - s)} \cdot \frac{R_{fb}}{m^2}$). Thus, for practical MIMO systems where m is not large, a few bits may be enough
to achieve a performance close to CSITR.
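To see how quickly the effective power loss shrinks with the feedback rate, the exponential factor alone can be tabulated. This is a rough sketch under stated assumptions: the function name `power_loss_scale` is hypothetical, the decay is taken to be of the form $2^{-R_{fb}/t}$ with $t = s(L_T - s)$, and the constant prefactors (which involve $\eta$) are deliberately left out, so only ratios between feedback rates are meaningful.

```python
def power_loss_scale(R_fb, L_T, s):
    """Exponential factor 2^(-R_fb / t), with t = s * (L_T - s).

    Constant prefactors of the bounds are omitted; the value only shows
    how the effective power loss scales with the feedback rate.
    """
    t = s * (L_T - s)
    if t <= 0:
        raise ValueError("need 1 <= s < L_T")
    return 2.0 ** (-R_fb / t)

# Each additional t bits of feedback halves the effective power loss.
for bits in (4, 8, 12):
    print(bits, power_loss_scale(bits, L_T=4, s=2))
```

For $L_T = 4$ and $s = 2$ we have $t = 4$, so every four extra feedback bits halve the factor.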

According to the above results, the information rate of a power on/off strategy with a well-designed
single rank beamforming codebook can be characterized. For a given $L_T \times L_R$ MIMO system with
finite rate channel state feedback up to $R_{fb}$ bits/channel use, $\mu_{\sup}$ can be estimated according to (25) for
all s such that $1 \le s \le L_T$. Substitute the bounds on $\mu_{\sup}$ into the information rate approximation (24)
and then use the asymptotic formulas in Section III-B.1 for the perfect beamforming case. The optimal
number of on-beams s and the corresponding information rate $\bar{\mathcal{I}}$ can then be calculated.

Fig. S gives the simulation results for a 4 x 2 MIMO system. The performance curves are plotted
as functions of $R_{fb}/m^2$. The simulated information rate (circles) is compared to the information rate
characterized by the lower bound (solid lines) and the upper bound (dotted lines) of $d_{c,\inf}^2$. The simulation
results show that the information rate characterized by the bounds (21) matches the actual performance
almost perfectly. Note that the previous approximation proposed in [33], [34], which is based on asymptotic
analysis and a Gaussian approximation, overestimates the information rate (a correction of the result is in
[35]). Our characterization is more accurate.



V. Performance Comparison 

While we have shown in Section III that the power on/off strategy with a constant number of on-beams is near optimal
for perfect beamforming, this section shows that a constant number of on-beams is
near optimal when beamforming is imperfect as well.

To show the near optimality of a constant number of on-beams, the single rank beamforming codebooks
are compared to multi-rank beamforming codebooks, which may contain beamforming matrices of different
ranks. For a multi-rank beamforming codebook

$$\mathcal{B} = \left\{\mathbf{Q}_i \in \mathbb{C}^{L_T \times s} : \mathbf{Q}_i^\dagger\mathbf{Q}_i = \mathbf{I}_s,\ 0 \le s \le L_T,\ 1 \le i \le 2^{R_{fb}}\right\},$$

define a single rank sub-code with rank s as

$$\mathcal{B}_s = \left\{\mathbf{Q}_i : \mathbf{Q}_i \in \mathcal{B},\ \mathrm{rank}(\mathbf{Q}_i) = s\right\},$$

where $0 \le s \le L_T$. The multi-rank beamforming codebook $\mathcal{B}$ can be viewed as a union of the single
rank sub-codes, $\mathcal{B} = \bigcup_{s=0}^{L_T}\mathcal{B}_s$. The corresponding power on/off strategy design problem is to find the
optimal multi-rank beamforming codebook $\mathcal{B}$, feedback function $\varphi(\cdot)$ and constant $P_{on}$ to maximize the
information rate under a power constraint $\rho$, as stated in Problem 1.

It is difficult to solve the optimization problem for multi-rank beamforming codebooks. However, for a
given multi-rank beamforming codebook $\mathcal{B}$ and a given $P_{on}$, the following theorem gives the explicit form
of the optimal feedback function, say $\varphi(\cdot)$, which avoids an exhaustive search over all possible feedback functions.
The intuition is the same as that behind Theorem 1: all the "good" beams, and
only the "good" beams, should be turned on. The particular aspect of the following theorem is that "good"
beams need to be reasonably defined.

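Theorem 6: Consider the power on/off strategy with a given multi-rank beamforming codebook $\mathcal{B} =
\bigcup_{s=0}^{L_T}\mathcal{B}_s$ and a given $P_{on}$. For a given channel realization $\mathbf{H}$, define $\mathcal{I}_s(\mathbf{H})$ as the largest mutual information
achievable for a non-empty sub-code $\mathcal{B}_s$,

$$\mathcal{I}_s(\mathbf{H}) = \max_{\mathbf{Q}_i \in \mathcal{B}_s} \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right|,$$

where $\mathcal{B}_s \ne \phi$, $0 \le s \le L_T$ ($\mathcal{I}_0(\mathbf{H}) = 0$). Denote the optimal feedback function as $\varphi(\cdot)$. Then

$$\varphi(\mathbf{H}) = \arg\max_{i:\ \mathbf{Q}_i \in \mathcal{B}_{\hat{s}(\mathbf{H})}} \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right|,$$

where

$$\hat{s}(\mathbf{H}) = \max\left\{s : \mathcal{B}_s \ne \phi,\ \mathcal{I}_s(\mathbf{H}) - \mathcal{I}_t(\mathbf{H}) \ge (s - t)\kappa \text{ for all } t \text{ s.t. } 0 \le t < s,\ \mathcal{B}_t \ne \phi\right\}$$

and $\kappa$ is the appropriate threshold to satisfy the average power constraint

$$E_{\mathbf{H}}\left[\hat{s}(\mathbf{H})P_{on}\right] = \rho.$$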

Proof: See Appendix QJ ■ 
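The rank selection rule of Theorem 6 can be sketched as follows. This is an illustrative sketch: the function name `select_rank` and the dictionary interface are assumptions, the per-rank best mutual information values are taken as given inputs, and the comparison is written with a non-strict inequality (ties occur with probability zero).

```python
def select_rank(best_rate_by_rank, kappa):
    """Largest rank s such that, against every smaller available rank t,
    the rate gain I_s - I_t is at least (s - t) * kappa.

    best_rate_by_rank maps each rank with a non-empty sub-code to the
    largest mutual information achievable within that sub-code.
    """
    ranks = sorted(best_rate_by_rank)
    chosen = None
    for s in ranks:
        if all(best_rate_by_rank[s] - best_rate_by_rank[t] >= (s - t) * kappa
               for t in ranks if t < s):
            chosen = s
    return chosen
```

Once the rank is selected, the fed-back index is the codeword of that rank with the largest mutual information. A larger threshold $\kappa$ makes extra beams harder to justify, which lowers the average transmit power.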
The following examples are direct applications of Theorem 6.

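Example 1: Let $\mathcal{B} = \{\mathbf{I}_{L_T}, \mathbf{Q}_\phi\}$, where $\mathbf{Q}_\phi$ is the artificial notation for the case that the transmitter is
turned off. Then the optimal power on/off function is to turn on all transmit antennas if

$$\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{H}^\dagger\right| \ge \kappa L_T$$

and turn off the transmitter if

$$\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{H}^\dagger\right| < \kappa L_T,$$

where $\kappa$ is an appropriately chosen threshold to satisfy

$$L_T P_{on}\Pr\left\{\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{H}^\dagger\right| \ge \kappa L_T\right\} = \rho.$$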
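The threshold in Example 1 can be estimated by simulation, since $\kappa L_T$ is simply a quantile of the mutual information. The sketch below is only illustrative: the function name `on_off_threshold`, the use of an empirical quantile, and the i.i.d. $\mathcal{CN}(0,1)$ channel sampler are choices made for the example.

```python
import numpy as np

def on_off_threshold(L_T, L_R, P_on, rho, trials, rng):
    """Estimate kappa such that L_T * P_on * Pr{rate >= kappa * L_T} = rho.

    Returns (kappa, empirical on-probability, average rate in nats).
    """
    p_on = rho / (L_T * P_on)
    if not 0.0 < p_on <= 1.0:
        raise ValueError("rho / (L_T * P_on) must lie in (0, 1]")
    rates = np.empty(trials)
    for i in range(trials):
        H = (rng.standard_normal((L_R, L_T))
             + 1j * rng.standard_normal((L_R, L_T))) / np.sqrt(2.0)
        rates[i] = np.linalg.slogdet(np.eye(L_R) + P_on * H @ H.conj().T)[1]
    # The transmitter is on for the best p_on fraction of channel states.
    kappa = np.quantile(rates, 1.0 - p_on) / L_T
    on = rates >= kappa * L_T
    return kappa, on.mean(), (rates * on).mean()
```

With $L_T = 4$, $P_{on} = 1$ and $\rho = 1$, the transmitter should be on for roughly a quarter of the channel realizations.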

Example 2: Let $|\mathcal{B}| \to \infty$ and let $\mathcal{B}$ be constructed so that the beamforming is asymptotically perfect. It
is easy to verify that the optimal feedback function given by Theorem 6 is the same as the one given in
Theorem 1 for the perfect beamforming case.

Although the optimal feedback function for a multi-rank beamforming codebook is given in Theorem
6, it is difficult to find the optimal multi-rank beamforming codebook $\mathcal{B}$, the optimal $P_{on}$ and the
corresponding information rate. In our simulation, we try different multi-rank codebooks and different
values of $P_{on}$ and then choose the best one. Specifically, denote by $K_s$ the size of the sub-code $\mathcal{B}_s$, $K_s = |\mathcal{B}_s|$.
We try all possible combinations $[K_0, K_1, \cdots, K_{L_T}]$ such that $K_s \in \mathbb{Z}^+ \cup \{0\}$ and $\sum_{s=0}^{L_T} K_s \le 2^{R_{fb}}$.
For each $[K_0, K_1, \cdots, K_{L_T}]$, we construct the sub-codes $\mathcal{B}_s$ such that $|\mathcal{B}_s| = K_s$ for $s = 0, 1, \cdots, L_T$
according to the max-min criterion in [27]. The resulting multi-rank beamforming codebook is given
by $\mathcal{B} = \bigcup_{s=0}^{L_T}\mathcal{B}_s$. For every multi-rank codebook $\mathcal{B}$, we try different values of $P_{on}$ and search for the optimal
one. The optimal multi-rank codebook $\mathcal{B}$ is chosen from the codebooks corresponding to all possible
$[K_0, K_1, \cdots, K_{L_T}]$.
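The enumeration of sub-code size vectors described above can be sketched as a small generator. The function name `sub_code_size_vectors` is hypothetical, and the constraint is taken to be that the sizes are non-negative integers whose sum does not exceed the codebook budget.

```python
def sub_code_size_vectors(num_ranks, max_total):
    """Yield every tuple (K_0, ..., K_{num_ranks-1}) of non-negative
    integers whose sum is at most max_total."""
    if num_ranks == 0:
        yield ()
        return
    for k in range(max_total + 1):
        for rest in sub_code_size_vectors(num_ranks - 1, max_total - k):
            yield (k,) + rest

# Small illustration: three ranks (s = 0, 1, 2) and a budget of 4 codewords.
vectors = list(sub_code_size_vectors(3, 4))
print(len(vectors))
```

The number of candidates grows quickly with $L_T$ and $R_{fb}$, which is one reason the search over multi-rank codebooks is expensive compared with a single rank codebook.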

Fig. 5 shows the simulation results. Fig. 5(a) compares the information rates of single rank beamforming
codebooks and multi-rank beamforming codebooks. Fig. 5(b) presents the relative performance, which is
defined as the ratio of the considered information rate to the capacity of a 4 x 2 MIMO system with
perfect CSITR. We also present the information rate characterization given by the upper bound of $d_{c,\inf}^2$ (Section
IV-B). Simulations show that single rank beamforming codebooks (dashed lines) achieve almost the same
information rate as multi-rank beamforming codebooks (circles). The performance difference is noticeable
only in the very low SNR regime. This is because the power on/off strategy with a single rank beamforming
codebook is designed according to the asymptotic distribution of the eigenvalues of the normalized matrix $\mathbf{H}\mathbf{H}^\dagger$, while the key
parameters ($P_{on}$ and $\kappa$) of the power on/off strategy with multi-rank beamforming codebooks are numerically
optimized according to the actual distribution of those eigenvalues. According to the simulation results, the power
on/off strategy with a constant number of on-beams provides a simple but near-optimal solution for finite
rate channel state feedback.



VI. Conclusions 

This paper accurately characterizes the information rate of the power on/off strategy with finite rate 
channel state feedback. According to asymptotic analysis, the power on/off strategy with a constant number 
of on-beams is employed and studied. Simulations show that this strategy is near optimal for all SNR 
regimes. We derive asymptotic formulas for perfect beamforming case and introduce the power efficiency 
factor to quantify the effect of imperfect beamforming. By combining a formula for power efficiency factor 
and the asymptotic formulas for perfect beamforming, we characterize the corresponding information rate 
accurately for all SNR regimes. 

An important point that is not addressed in this paper is the complexity of selecting the feedback
beamforming matrix from a codebook, which may involve an exhaustive search. To avoid an exhaustive search,
beamforming codebooks with a certain structure may be considered in the future so that the matrix selection
can be made more efficient by exploiting the structure of the codebook.



Appendix 

A. Proof of Theorem 1

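Let us start with the single-input single-output (SISO) case. In the SISO case, $\mathbf{H}$ is a scalar and $\mathbf{H}^\dagger\mathbf{H}$ has
only one eigenvalue, i.e., $\lambda = |\mathbf{H}|^2$. Denote the corresponding cumulative distribution function (CDF) as
$F_\lambda(\lambda)$. Define $\Omega$ as the set of $\lambda$ corresponding to the case that the transmitter is turned on. Then any
deterministic power on/off strategy can be uniquely defined by $\Omega$. Thus the optimization problem is to
choose an appropriate Lebesgue measurable set $\Omega \subset \mathbb{R}^+ \cup \{0\}$ to maximize

$$\int_\Omega \log(1 + P_{on}\lambda)\,dF_\lambda(\lambda)$$

with the power constraint

$$\int_\Omega dF_\lambda(\lambda) = \rho/P_{on}.$$

Since $F_\lambda(\lambda)$ is continuous, there exists an $\Omega$ satisfying the power constraint. The optimization problem
is well defined.

Define $\Omega^* = \{\lambda : \lambda \ge \kappa\}$ such that $\int_{\Omega^*} dF_\lambda(\lambda) = \rho/P_{on}$. For any Lebesgue measurable set $\Omega \subset
\mathbb{R}^+ \cup \{0\}$ such that $\int_\Omega dF_\lambda(\lambda) = \rho/P_{on}$,

$$\begin{aligned}
&\int_{\Omega^*}\log(1 + P_{on}\lambda)\,dF_\lambda(\lambda) - \int_\Omega \log(1 + P_{on}\lambda)\,dF_\lambda(\lambda) \\
&\quad= \int_{\Omega^* - \Omega}\log(1 + P_{on}\lambda)\,dF_\lambda(\lambda) - \int_{\Omega - \Omega^*}\log(1 + P_{on}\lambda)\,dF_\lambda(\lambda) \\
&\quad\overset{(a)}{\ge} \int_{\Omega^* - \Omega}\log(1 + P_{on}\kappa)\,dF_\lambda(\lambda) - \int_{\Omega - \Omega^*}\log(1 + P_{on}\kappa)\,dF_\lambda(\lambda) \\
&\quad= \log(1 + P_{on}\kappa)\left[\int_{\Omega^* - \Omega}dF_\lambda(\lambda) - \int_{\Omega - \Omega^*}dF_\lambda(\lambda)\right] \\
&\quad\overset{(b)}{=} 0,
\end{aligned}$$

where

(a) follows from the facts that $\lambda \ge \kappa$ when $\lambda \in \Omega^* - \Omega$ and $\lambda < \kappa$ when $\lambda \in \Omega - \Omega^*$, and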
(b) holds because $\int_{\Omega^*} dF_\lambda(\lambda) = \int_\Omega dF_\lambda(\lambda)$ implies $\int_{\Omega^* - \Omega} dF_\lambda(\lambda) = \int_{\Omega - \Omega^*} dF_\lambda(\lambda)$.

Therefore, $\Omega^*$ is the optimal set and the power on/off strategy defined by $\Omega^*$ is optimal.
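The SISO argument can be illustrated numerically. The sketch below assumes Rayleigh fading, so that $\lambda = |\mathbf{H}|^2$ is exponentially distributed with unit mean; this distribution, the function name `siso_rates`, and the choice of the competing set (the weakest channel states with the same probability) are assumptions made for the example.

```python
import numpy as np

def siso_rates(P_on, rho, trials, rng):
    """Compare the threshold set {lambda >= kappa} with the set of the
    weakest channel states having the same on-probability rho / P_on."""
    q = rho / P_on                       # required on-probability
    lam = rng.exponential(1.0, trials)   # |H|^2 for H ~ CN(0, 1)
    kappa = -np.log(q)                   # Pr{lambda >= kappa} = q
    kappa_low = -np.log(1.0 - q)         # Pr{lambda <= kappa_low} = q
    gain = np.log1p(P_on * lam)
    rate_threshold = np.mean(gain * (lam >= kappa))
    rate_low = np.mean(gain * (lam <= kappa_low))
    return kappa, rate_threshold, rate_low
```

Both sets use the same average power, but turning the transmitter on for the strongest channel states yields the larger rate, as the proof shows for any competing set.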

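The proof for the MIMO case follows the same idea. For an $L_T \times L_R$ MIMO system, denote the vector
of the ordered $L_T$ eigenvalues of $\mathbf{H}^\dagger\mathbf{H}$ as $\boldsymbol{\lambda} = [\lambda_1, \cdots, \lambda_{L_T}]$ where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{L_T} \ge 0$ and the
corresponding multivariate CDF as $F_{\boldsymbol{\lambda}}(\boldsymbol{\lambda})$. Define

$$\Omega_k \triangleq \{\boldsymbol{\lambda} : \text{the eigen-channel corresponding to } \lambda_k \text{ is on}\},$$

where $1 \le k \le L_T$. Then any deterministic power on/off strategy can be uniquely defined by the $\Omega_k$'s where
$1 \le k \le L_T$. The optimization problem is to choose Lebesgue measurable sets $\Omega_k \subset (\mathbb{R}^+ \cup \{0\})^m$,
$k = 1, 2, \cdots, L_T$, to maximize the information rate subject to the power constraint.

Since $F_{\boldsymbol{\lambda}}(\boldsymbol{\lambda})$ is continuous, there exist $\Omega_k$'s satisfying the power constraint. The optimization problem is
well defined.

Define $\Omega_k^* = \{\boldsymbol{\lambda} : \lambda_1 \ge \cdots \ge \lambda_k \ge \kappa\}$, where $\kappa$ is chosen to satisfy the power constraint. For any
Lebesgue measurable sets $\Omega_k \subset (\mathbb{R}^+ \cup \{0\})^m$ satisfying the power constraint, the difference between the
information rates achieved by the $\Omega_k^*$'s and by the $\Omega_k$'s is non-negative,
where the inequality follows from the facts that $\lambda_k \ge \kappa$ when $\boldsymbol{\lambda} \in \Omega_k^* - \Omega_k$ and $\lambda_k < \kappa$ when $\boldsymbol{\lambda} \in \Omega_k - \Omega_k^*$,
and the last line holds because of the power constraint. Therefore, the power on/off strategy defined by
the $\Omega_k^*$'s is optimal.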

B. Proof of Theorem 3

The following lemma is needed to prove Theorem 3.

Lemma 2: For a continuous and differentiable function $h(x)$ defined on $(a, b)$, denote the first derivative
as $h'(x)$. If $h(x) = 0$ implies $h'(x) < 0$, then $h(x)$ has at most one zero in its domain. Furthermore,
denote by $x_0$ the unique zero if it exists; then $h(x) > 0$ for all $x \in (a, x_0)$ and $h(x) < 0$ for all $x \in (x_0, b)$.

Proof: We first show that $h(x)$ has at most one zero. Let $x_0$ be a zero of $h(x)$. Since $h'(x_0) < 0$ according to the
assumption, $\exists\,\epsilon > 0$ such that $h(x_0 + \epsilon) < 0$, $h(x_0 - \epsilon) > 0$ and $h(x) \ne 0$ for all $x \in (x_0 - \epsilon, x_0 + \epsilon)$
except $x_0$. Now suppose that $x_1 \in (a, b)$ is another zero of $h(x)$ adjacent to $x_0$. W.l.o.g., we assume that
$x_1 > x_0$. Then $x_0 < x_0 + \epsilon/2 < x_1$. Since $h(x)$ is continuous, $h(x)$ crosses the x axis at $x_1$ from
negative to positive as x increases. Thus $h'(x_1) \ge 0$, which contradicts the assumption.



— {A : the eigen channel corresponding to A^ is on} 




with the power constraint 
Assume that $x_0$ is the unique zero if it exists. Because of the continuity of $h(x)$, it is easy to verify
that $h(x) > 0$ for all $x \in (a, x_0)$ and $h(x) < 0$ for all $x \in (x_0, b)$. ■

To prove Theorem 3, we discuss two cases. One case is that $d\bar{\mathcal{I}}_\infty/da$ has zeros in $(0, \pi)$ and the other case
is that it has no zero in $(0, \pi)$.



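Evaluate $d\bar{\mathcal{I}}_\infty/da$ for an $a \in (0, \pi)$. Denote

$$z(t) \triangleq \frac{\rho}{y s_\infty}\left(1 + y - 2\sqrt{y}\cos(t)\right). \tag{26}$$

Then

$$\begin{aligned}
\frac{d\bar{\mathcal{I}}_\infty}{da} &= \frac{d}{da}\int_a^\pi \ln\left(1 + \frac{\rho}{y s_\infty}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right) f_T(t)\,dt \\
&= f_T(a)\left[1 - \ln(1 + z(a)) - \int_a^\pi \frac{1}{1 + z(t)}\frac{f_T(t)}{s_\infty}\,dt\right].
\end{aligned}$$

Define

$$J \triangleq 1 - \ln(1 + z(a)) - \int_a^\pi \frac{1}{1 + z(t)}\frac{f_T(t)}{s_\infty}\,dt. \tag{27}$$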



Because /y (a) > for all a £ (0, 7r), the sign of ^t 22 - is uniquely determined by J when a £ (0, n). 

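For the first case that $d\bar{\mathcal{I}}_\infty/da$ has zeros in $(0, \pi)$, we argue that $d\bar{\mathcal{I}}_\infty/da$ has a unique zero, say $a_0$, in $(0, \pi)$ and that $\bar{\mathcal{I}}_\infty$ is maximized at $a_0$. This can be accomplished by showing that $J = 0$ implies $dJ/da < 0$. Note that

\[ \begin{aligned} \frac{dJ}{da} &= -\frac{1}{1 + z(a)}\left(\frac{f_T(a)}{\bar{s}_\infty} z(a) + \frac{\rho}{y\bar{s}_\infty} 2\sqrt{y}\sin(a)\right) + \frac{1}{1 + z(a)}\frac{f_T(a)}{\bar{s}_\infty} - \frac{f_T(a)}{\bar{s}_\infty}\int_a^\pi \frac{1}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt \\ &= -\frac{f_T(a)}{\bar{s}_\infty}\left[\frac{z(a) - 1}{z(a) + 1} + \int_a^\pi \frac{1}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt\right] - \frac{1}{1 + z(a)}\frac{\rho}{y\bar{s}_\infty} 2\sqrt{y}\sin(a). \end{aligned} \tag{28} \]

$J = 0$ implies

\[ \int_a^\pi \frac{1}{1 + z(t)}\frac{f_T(t)}{\bar{s}_\infty}\,dt = 1 - \ln(1 + z(a)). \]

Then

\[ \int_a^\pi \frac{1}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt \ge \left(\int_a^\pi \frac{1}{1 + z(t)}\frac{f_T(t)}{\bar{s}_\infty}\,dt\right)^2 = \left(1 - \ln(1 + z(a))\right)^2, \]

where the inequality follows from the fact that $\int_a^\pi \frac{f_T(t)}{\bar{s}_\infty}\,dt = 1$, so that the Cauchy-Schwarz inequality applies. Thus,

\[ \frac{z(a) - 1}{z(a) + 1} + \int_a^\pi \frac{1}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt \ge \frac{z(a) - 1}{z(a) + 1} + \left(1 - \ln(1 + z(a))\right)^2 > 0, \]

where the last inequality follows from the facts that $z(a) > 0$ for $a \in (0, \pi)$ and that $\frac{x - 1}{x + 1} + (1 - \ln(1 + x))^2 > 0$ for $x > 0$, which can be verified by evaluating the first and second derivatives. Therefore, $J = 0$ implies that the first term of (28) is negative. It is also true that the last term of (28) is always negative for $a \in (0, \pi)$. We have shown that $J = 0$ implies $dJ/da < 0$. According to Lemma 2, $J$ has a unique zero in $(0, \pi)$, say $a_0$, and $J > 0$ for $0 < a < a_0$ and $J < 0$ for $a_0 < a < \pi$. Since the sign of $d\bar{\mathcal{I}}_\infty/da$ is determined by $J$, the same conclusion holds for $d\bar{\mathcal{I}}_\infty/da$. Therefore, $\bar{\mathcal{I}}_\infty$ has the unique maximum point $a_0$ in $(0, \pi)$. Furthermore, because of the continuity of $\bar{\mathcal{I}}_\infty$, $a_0$ is also the unique maximum point of $\bar{\mathcal{I}}_\infty$ in $[0, \pi)$.

For the second case that $d\bar{\mathcal{I}}_\infty/da$ has no zero in $(0, \pi)$, we show that $\bar{\mathcal{I}}_\infty$ is maximized at $a = 0$. If $d\bar{\mathcal{I}}_\infty/da$ has no zero in $(0, \pi)$, $J$ has no zero in $(0, \pi)$. But as $a \to \pi$, it can be verified that $z(a) \to +\infty$, $\ln(1 + z(a)) \to +\infty$ and $J \to -\infty$. Then $J < 0$ for $a \in (0, \pi)$ because of continuity. Therefore, $d\bar{\mathcal{I}}_\infty/da < 0$ for all $a \in (0, \pi)$ and $\bar{\mathcal{I}}_\infty$ is maximized at $a = 0$.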
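The sign argument above can be checked numerically. The following is a minimal sketch, assuming the density $f_T(t) = 2\sin^2(t)/(\pi(1 + y - 2\sqrt{y}\cos t))$ implied by the integrands of Appendix D and the normalization of $z(t)$ in (26); the function names, the grid, and the values $y = 0.5$, $\rho = 1$ are illustrative choices, not taken from the paper. It scans $J$ of (27) over a grid of $a$, counts its sign changes, and compares the location of the sign change with the grid maximizer of the information rate.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def lam(t, y):
    # Eigenvalue parametrization: 1 + y - 2 sqrt(y) cos(t)
    return 1.0 + y - 2.0 * math.sqrt(y) * math.cos(t)

def f_T(t, y):
    # Assumed density in t (form implied by Appendix D)
    return 2.0 * math.sin(t) ** 2 / (math.pi * lam(t, y))

def rate_and_J(a, y, rho):
    """Return (information rate, J of (27)) for a threshold angle a."""
    s = simpson(lambda t: f_T(t, y), a, math.pi)
    z = lambda t: rho * lam(t, y) / (y * s)  # z(t) of (26)
    rate = simpson(lambda t: math.log1p(z(t)) * f_T(t, y), a, math.pi)
    tail = simpson(lambda t: f_T(t, y) / (1.0 + z(t)), a, math.pi) / s
    return rate, 1.0 - math.log1p(z(a)) - tail

def scan(y=0.5, rho=1.0, points=60):
    """Scan a grid of a in (0, pi) and summarize the behaviour of J and the rate."""
    grid = [0.02 + k * (math.pi - 0.1) / points for k in range(points)]
    values = [rate_and_J(a, y, rho) for a in grid]
    rates = [v[0] for v in values]
    signs = [v[1] > 0 for v in values]
    sign_changes = sum(1 for p, q in zip(signs, signs[1:]) if p != q)
    i_cross = signs.index(False)      # first grid point where J is not positive
    i_max = rates.index(max(rates))   # grid maximizer of the rate
    return {"sign_changes": sign_changes, "i_cross": i_cross,
            "i_max": i_max, "a_max": grid[i_max]}
```

Calling `scan()` returns a small dictionary; under the stated assumptions one expects a single sign change of $J$ and a rate maximizer adjacent to it on the grid.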

C. Proof of Corollary 

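The proof follows the same idea as the proof of Theorem 3 (see Appendix B). Let $J$ be defined in (27). Then the optimal $a$ to maximize $\bar{\mathcal{I}}_\infty$, say $a_0$, should be either the unique zero of $J$ if it exists, or $0$ if $J$ has no zero in $(0, \pi)$. We first prove that $J = 0$ implies $dJ/d\rho < 0$ for a given $a \in (0, \pi)$ and $\rho > 0$. Then we show that $a_0$ is a non-increasing function of $\rho$ (equivalently, the optimal $\bar{s}_\infty$ is a non-decreasing function of $\rho$).

For a given $a \in (0, \pi)$ and $\rho > 0$, we prove that $J = 0$ implies $dJ/d\rho < 0$ as follows. Let $z(t)$ be defined in (26). Note that $z(t)$ is a function of $\rho$. Evaluation of $dJ/d\rho$ gives

\[ \begin{aligned} \frac{dJ}{d\rho} &= -\frac{1}{\rho}\left[\frac{z(a)}{1 + z(a)} - \int_a^\pi \frac{z(t)}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt\right] \\ &= -\frac{1}{\rho}\left[\frac{z(a)}{1 + z(a)} - \int_a^\pi \frac{z(t)}{1 + z(t)}\frac{f_T(t)}{\bar{s}_\infty}\,dt + \int_a^\pi \frac{z^2(t)}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt\right]. \end{aligned} \tag{29} \]

$J = 0$ implies

\[ \ln(1 + z(a)) = \int_a^\pi \frac{z(t)}{1 + z(t)}\frac{f_T(t)}{\bar{s}_\infty}\,dt < 1, \tag{30} \]

where the inequality follows from the fact that $\frac{z(t)}{1 + z(t)} < 1$. Then we have $z(a) < e - 1$. Furthermore,

\[ \int_a^\pi \frac{z^2(t)}{(1 + z(t))^2}\frac{f_T(t)}{\bar{s}_\infty}\,dt \ge \frac{z^2(a)}{(1 + z(a))^2}, \tag{31} \]

where the inequality follows from $\frac{z(t)}{1 + z(t)} \ge \frac{z(a)}{1 + z(a)}$ for all $t \in (a, \pi)$.

Note that the function $\frac{x}{1 + x} - \ln(1 + x) + \left(\frac{x}{1 + x}\right)^2 > 0$ for $0 < x < e - 1$, which can be verified by checking its first and second derivatives. Substituting (30) and (31) into (29), we have shown that $J = 0$ implies $dJ/d\rho < 0$ for $a \in (0, \pi)$.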

Now let $a_0$ maximize $\bar{\mathcal{I}}_\infty$ for an SNR $\rho_0 > 0$, let $a_1$ maximize $\bar{\mathcal{I}}_\infty$ for an SNR $\rho_1 > 0$, and let $\rho_0 < \rho_1$. In the following, we prove that $a_1 \le a_0$ by studying two cases: one is that $a_0 > 0$ and the other is that $a_0 = 0$.

For the first case that $a_0 > 0$, we have $J|_{a_0, \rho_0} = 0$ by Theorem 3. Since $J = 0$ implies $dJ/d\rho < 0$ for $a = a_0$, $J|_{a_0, \rho_1} < 0$ by Lemma 2 in Appendix B. But $a_1$ maximizes $\bar{\mathcal{I}}_\infty$ at $\rho_1$. Then either $a_1 = 0$, or $a_1 > 0$ and $J|_{a_1, \rho_1} = 0$ again by Theorem 3. If $a_1 = 0$, $a_1 < a_0$. If $a_1 > 0$, then $J|_{0 < a < a_1, \rho_1} > 0$ and $J|_{a_1 < a < \pi, \rho_1} < 0$ according to the proof in Appendix B. Since we have shown $J|_{a_0, \rho_1} < 0$, $a_1 < a_0$. Thus $a_1 \le a_0$ if $a_0 > 0$.

On the other hand, $a_0 = 0$ implies $a_1 = a_0 = 0$. Suppose that $a_1 > a_0$; then $a_1 \in (0, \pi)$, $J|_{a_1, \rho_1} = 0$ and $J|_{a_1, \rho_0} > 0$. Because $J|_{a \to \pi, \rho_0} \to -\infty$, $\exists a' \in (a_1, \pi)$ such that $J|_{a', \rho_0} = 0$. According to Theorem 3, $a'$ maximizes $\bar{\mathcal{I}}_\infty$ for $\rho_0$. This contradicts the assumption that $a_0 = 0$ maximizes $\bar{\mathcal{I}}_\infty$ for $\rho_0$. Therefore $a_1 \le a_0$ and thus $a_1 = a_0 = 0$.
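The monotonicity just proved can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes the density $f_T(t) = 2\sin^2(t)/(\pi(1 + y - 2\sqrt{y}\cos t))$ and $z(t)$ as in (26), and it locates the zero of $J$ in (27) by bisection. The helper names, the tolerances and the sample SNR values are hypothetical.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def lam(t, y):
    return 1.0 + y - 2.0 * math.sqrt(y) * math.cos(t)

def f_T(t, y):
    # Assumed density in t (form implied by Appendix D)
    return 2.0 * math.sin(t) ** 2 / (math.pi * lam(t, y))

def J(a, y, rho):
    """J of (27) for threshold angle a, ratio y and SNR rho."""
    s = simpson(lambda t: f_T(t, y), a, math.pi)
    z = lambda t: rho * lam(t, y) / (y * s)
    tail = simpson(lambda t: f_T(t, y) / (1.0 + z(t)), a, math.pi) / s
    return 1.0 - math.log1p(z(a)) - tail

def optimal_a(y, rho, eps=1e-3, iters=40):
    """Zero of J in (0, pi) by bisection, or 0.0 when J is not positive near a = 0."""
    if J(eps, y, rho) <= 0.0:
        return 0.0
    lo, hi = eps, math.pi - 1e-2  # J(lo) > 0, and J(hi) < 0 since J -> -inf as a -> pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if J(mid, y, rho) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `[optimal_a(0.5, rho) for rho in (0.1, 1.0, 10.0)]` should give a non-increasing sequence, with the high-SNR entry equal to zero (all beams on).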
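D. Calculation of $\bar{s}_\infty$

Write the formula for $\bar{s}_\infty$ in another form. Recall the definition of $f_T(t)$. It is easy to see that $f_T(t) = f_T(-t)$. In order to use the symmetry, we define the integral range

\[ I_R = [-\pi, -a] \cup [a, \pi]. \tag{32} \]

Then the normalized number of on-beams is given by

\[ \bar{s}_\infty = \frac{1}{2}\int_{I_R} f_T(t)\,dt. \]

When $y < 1$,

\[ \begin{aligned} \bar{s}_\infty &= \frac{1}{2}\int_{I_R} \frac{1}{2\pi}\frac{2 - e^{2it} - e^{-2it}}{1 + r^2 - 2r\cos(t)}\,dt \\ &= \frac{1}{4\pi(1 - r^2)}\int_{I_R}\left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right)\left(2 - e^{2it} - e^{-2it}\right)dt, \end{aligned} \]

where $r = \sqrt{y}$. Because of the symmetry of the integral range and the integrand, we have

\[ \int_{I_R}\frac{1}{1 - re^{it}}\left(2 - e^{2it} - e^{-2it}\right)dt = \int_{I_R}\frac{1}{1 - re^{-it}}\left(2 - e^{2it} - e^{-2it}\right)dt. \]

Then

\[ \begin{aligned} \bar{s}_\infty &= \frac{1}{4\pi(1 - r^2)}\int_{I_R}\left(\frac{2}{1 - re^{it}} - 1\right)\left(2 - e^{2it} - e^{-2it}\right)dt \\ &= \frac{1}{4\pi(1 - r^2)}\int_{I_R}\left[e^{2it} + \frac{2}{r}e^{it} + 2 - 2r^2 - 2re^{-it} - e^{-2it} - 2\left(\frac{1}{r} - r\right)^2\frac{re^{it}}{1 - re^{it}}\right]dt. \end{aligned} \]

Note that

\[ \int_{I_R}\frac{re^{it}}{1 - re^{it}}\,dt = i\int_{I_R} d\ln\left(1 - re^{it}\right) = i\ln\left(\frac{1 - re^{-ia}}{1 - re^{ia}}\right) = -2\theta_r, \tag{33} \]

where

\[ \theta_r = \tan^{-1}\left(\frac{r\sin(a)}{1 - r\cos(a)}\right). \]

Then

\[ \bar{s}_\infty = \frac{1}{\pi}\left[(\pi - a) - \frac{1}{r}\sin(a) + \frac{1 - r^2}{r^2}\theta_r\right]. \]

When $y = 1$, it is easy to see that

\[ \bar{s}_\infty = \frac{1}{2\pi}\int_{I_R}\left(1 + \cos(t)\right)dt = \frac{1}{\pi}\left(\pi - a - \sin(a)\right). \]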
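The closed forms for $\bar{s}_\infty$ can be cross-checked against direct quadrature. The sketch below assumes $f_T(t) = \frac{1}{2\pi}\frac{2 - e^{2it} - e^{-2it}}{1 + y - 2\sqrt{y}\cos t} = \frac{2\sin^2 t}{\pi(1 + y - 2\sqrt{y}\cos t)}$, as used in the integrals of this appendix; the function names and the Simpson rule are illustrative choices.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def s_bar_closed(a, y):
    """Closed-form normalized number of on-beams for 0 < y <= 1."""
    if y == 1.0:
        return (math.pi - a - math.sin(a)) / math.pi
    r = math.sqrt(y)
    theta_r = math.atan(r * math.sin(a) / (1.0 - r * math.cos(a)))
    return ((math.pi - a) - math.sin(a) / r
            + (1.0 - y) / y * theta_r) / math.pi

def s_bar_numeric(a, y):
    """Direct integration of the assumed density f_T over [a, pi]."""
    r = math.sqrt(y)
    f = lambda t: 2.0 * math.sin(t) ** 2 / (
        math.pi * (1.0 + y - 2.0 * r * math.cos(t)))
    return simpson(f, a, math.pi)
```

With $a = 0$ the closed form reduces to $\bar{s}_\infty = 1$, that is, all beams on, which is a quick sanity check on the normalization of $f_T$.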
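E. Calculation of $\bar{\mathcal{I}}_\infty$

The normalized capacity is given by

\[ \bar{\mathcal{I}}_\infty = \frac{1}{2}\int_{I_R}\ln\left(1 + \frac{\rho}{y\bar{s}_\infty}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right)f_T(t)\,dt, \]

where $f_T(t)$ is as defined earlier, the integral range $I_R$ is defined in (32) and $\bar{s}_\infty$ can be calculated according to Proposition 1.

Define

\[ \alpha \triangleq \frac{y\bar{s}_\infty}{\rho}, \tag{34} \]

then

\[ \ln\left(1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right) = \ln\left(1 + r^2 + \alpha - 2r\cos(t)\right) - \ln(\alpha), \]

where $r = \sqrt{y}$. Also define

\[ w \triangleq \frac{1}{2}\left(1 + y + \alpha + \sqrt{(1 + y + \alpha)^2 - 4y}\right) \tag{35} \]

and

\[ u \triangleq \frac{1}{2\sqrt{y}}\left(1 + y + \alpha - \sqrt{(1 + y + \alpha)^2 - 4y}\right), \tag{36} \]

then it is easy to verify that $u < 1$ and

\[ \ln\left(1 + r^2 + \alpha - 2r\cos(t)\right) = \ln(w) + \ln\left(1 + u^2 - 2u\cos(t)\right). \]

Therefore,

\[ \ln\left(1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(t)\right)\right) = \ln(w) - \ln(\alpha) + \ln\left(1 + u^2 - 2u\cos(t)\right) \]

and

\[ \bar{\mathcal{I}}_\infty = \left(\ln(w) - \ln(\alpha)\right)\bar{s}_\infty + \frac{1}{2}\int_{I_R}\ln\left(1 + u^2 - 2u\cos(t)\right)f_T(t)\,dt. \]

Define

\[ I_0 = \frac{1}{2}\int_{I_R}\ln\left(1 + u^2 - 2u\cos(t)\right)f_T(t)\,dt. \]

Then

\[ \bar{\mathcal{I}}_\infty = \left(\ln(w) - \ln(\alpha)\right)\bar{s}_\infty + I_0. \]

Note that

\[ \ln\left(1 + u^2 - 2u\cos(t)\right) = \ln\left(1 - ue^{it}\right) + \ln\left(1 - ue^{-it}\right) \]

and

\[ \int_{I_R}\ln\left(1 - ue^{it}\right)f_T(t)\,dt = \int_{I_R}\ln\left(1 - ue^{-it}\right)f_T(t)\,dt. \]

Then

\[ I_0 = \int_{I_R}\ln\left(1 - ue^{it}\right)f_T(t)\,dt. \]

Calculate $I_0$ for the case $y < 1$ and the case $y = 1$ respectively.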
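When $y < 1$,

\[ \begin{aligned} I_0 &= \int_{I_R}\ln\left(1 - ue^{it}\right)\frac{1}{2\pi}\frac{2 - e^{2it} - e^{-2it}}{1 + r^2 - 2r\cos(t)}\,dt \\ &= -\frac{1}{2\pi(1 - r^2)}\int_{I_R}\ln\left(1 - ue^{it}\right)\left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right)\left(e^{2it} - 2 + e^{-2it}\right)dt. \end{aligned} \]

It is easy to verify that

\[ \frac{e^{2it} - 2 + e^{-2it}}{1 - re^{it}} = \left(\frac{1}{r} - r\right)^2\frac{re^{it}}{1 - re^{it}} - \frac{1}{r}e^{it} - 2 + r^2 + re^{-it} + e^{-2it} \tag{37} \]

and

\[ \frac{e^{2it} - 2 + e^{-2it}}{1 - re^{-it}} = \left(\frac{1}{r} - r\right)^2\frac{re^{-it}}{1 - re^{-it}} - \frac{1}{r}e^{-it} - 2 + r^2 + re^{it} + e^{2it}. \tag{38} \]

Then

\[ \left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right)\left(e^{2it} - 2 + e^{-2it}\right) = \left(\frac{1}{r} - r\right)\left[\left(\frac{1}{r} - r\right)\left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right) - \left(e^{it} + e^{-it}\right) - \left(\frac{1}{r} + r\right)\right]. \]

Define

\[ I_1 = \int_{I_R}\ln\left(1 - ue^{it}\right)dt, \tag{39} \]

\[ I_2 = \int_{I_R}\ln\left(1 - ue^{it}\right)\left(e^{it} + e^{-it}\right)dt, \tag{40} \]

\[ I_3 = \int_{I_R}\ln\left(1 - ue^{it}\right)\left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right)dt. \]

Then

\[ I_0 = \frac{1}{2\pi r}\left[\left(\frac{1}{r} + r\right)I_1 + I_2 + \left(r - \frac{1}{r}\right)I_3\right]. \]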
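Calculate $I_1$, $I_2$ and $I_3$ respectively. Because $|u| < 1$,

\[ \ln\left(1 - ue^{it}\right) = -\sum_{k=1}^{\infty}\frac{u^k e^{ikt}}{k}. \]

Therefore,

\[ I_1 = -\int_{I_R}\sum_{k=1}^{\infty}\frac{u^k}{k}e^{ikt}\,dt = -\sum_{k=1}^{\infty}\frac{u^k}{ik^2}\int_{I_R} de^{ikt} = i\left[\sum_{k=1}^{\infty}\frac{\left(ue^{-ia}\right)^k}{k^2} - \sum_{k=1}^{\infty}\frac{\left(ue^{ia}\right)^k}{k^2}\right]. \]

Define

\[ \mathrm{Li}_2(x) = \sum_{n=1}^{\infty}\frac{x^n}{n^2}, \quad \text{for } |x| \le 1, \]

which is usually called the dilogarithm function [42]. Then

\[ I_1 = i\left[\mathrm{Li}_2\left(ue^{-ia}\right) - \mathrm{Li}_2\left(ue^{ia}\right)\right]. \tag{41} \]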

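To evaluate $I_2$, note that

\[ \int_{I_R}\ln\left(1 - ue^{it}\right)e^{it}\,dt = \int_{I_R}\frac{ue^{2it}}{1 - ue^{it}}\,dt + \frac{1}{i}\ln\left(1 - ue^{it}\right)e^{it}\Big|_{-\pi}^{-a} + \frac{1}{i}\ln\left(1 - ue^{it}\right)e^{it}\Big|_{a}^{\pi} \]

and

\[ \int_{I_R}\ln\left(1 - ue^{it}\right)e^{-it}\,dt = \int_{I_R}\frac{-u}{1 - ue^{it}}\,dt + \frac{1}{-i}\ln\left(1 - ue^{it}\right)e^{-it}\Big|_{-\pi}^{-a} + \frac{1}{-i}\ln\left(1 - ue^{it}\right)e^{-it}\Big|_{a}^{\pi}. \]

Then

\[ I_2 = -2\sin(a)\ln\left(1 + u^2 - 2u\cos(a)\right) + u\int_{I_R}\frac{e^{2it} - 1}{1 - ue^{it}}\,dt. \]

Note that

\[ u\,\frac{e^{2it} - 1}{1 - ue^{it}} = -e^{it} - u + \left(\frac{1}{u} - u\right)\frac{ue^{it}}{1 - ue^{it}} \]

and

\[ \int_{I_R}\frac{ue^{it}}{1 - ue^{it}}\,dt = -2\theta_u, \]

where

\[ \theta_u = \tan^{-1}\left(\frac{u\sin(a)}{1 - u\cos(a)}\right) \]

by an analysis similar to that in (33). Then

\[ I_2 = 2\left[-\sin(a)\ln\left(1 + u^2 - 2u\cos(a)\right) + \sin(a) - u(\pi - a) - \left(\frac{1}{u} - u\right)\theta_u\right]. \tag{42} \]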

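To evaluate $I_3$, note that

\[ \ln\left(1 - ue^{it}\right) = -\sum_{k=1}^{\infty}\frac{\left(ue^{it}\right)^k}{k} \]

and

\[ \frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1 = \sum_{l=-\infty}^{\infty} r^{|l|}e^{ilt} \]

because $|u| < 1$ and $|r| < 1$. Thus

\[ -I_3 = \int_{I_R}\left(\sum_{k=1}^{\infty}\frac{\left(ue^{it}\right)^k}{k}\right)\left(\sum_{l=-\infty}^{\infty} r^{|l|}e^{ilt}\right)dt. \]

Change the order of the double summation. Then

\[ -I_3 = \int_{I_R}\left[\sum_{l=0}^{\infty} r^l e^{-ilt}\sum_{k=1}^{\infty}\frac{(ur)^k}{k} + \sum_{l=1}^{\infty} e^{ilt}\left(r^l\sum_{k=1}^{l-1}\frac{(u/r)^k}{k} + r^{-l}\sum_{k=l}^{\infty}\frac{(ur)^k}{k}\right)\right]dt. \]

Define

\[ I_4 = \int_{I_R}\sum_{l=0}^{\infty} r^l e^{-ilt}\left(\sum_{k=1}^{\infty}\frac{(ur)^k}{k}\right)dt \]

and

\[ I_5 = \int_{I_R}\sum_{l=1}^{\infty} e^{ilt}\left(r^l\sum_{k=1}^{l-1}\frac{(u/r)^k}{k} + r^{-l}\sum_{k=l}^{\infty}\frac{(ur)^k}{k}\right)dt, \]

so that $-I_3 = I_4 + I_5$. Noting that $0 < r < 1$ and $0 < ur < 1$, $I_4$ is well defined and

\[ \begin{aligned} I_4 &= \sum_{k=1}^{\infty}\frac{(ur)^k}{k}\int_{I_R}\sum_{l=0}^{\infty} r^l e^{-ilt}\,dt \\ &= -\ln(1 - ur)\left(t - i\ln\left(1 - re^{-it}\right)\right)\Big|_{-\pi}^{-a} - \ln(1 - ur)\left(t - i\ln\left(1 - re^{-it}\right)\right)\Big|_{a}^{\pi} \\ &= -2\ln(1 - ur)\left(\pi - a - \theta_r\right), \end{aligned} \tag{43} \]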
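where $\theta_r$ is obtained according to an analysis similar to that in (33). To evaluate $I_5$, we substitute the definition of $u$ into $u/r$. It is easy to verify that

\[ \frac{u}{r} = \frac{1}{2r^2}\left(1 + r^2 + \alpha - \sqrt{(1 + r^2 + \alpha)^2 - 4r^2}\right) < \frac{1}{2r^2}\left(1 + r^2 + \alpha - \sqrt{(1 - r^2 + \alpha)^2}\right) = 1 \]

and

\[ \begin{aligned} |I_5| &\le \int_{I_R}\sum_{l=1}^{\infty}\left(r^l\sum_{k=1}^{l-1}\frac{(u/r)^k}{k} + r^{-l}\sum_{k=l}^{\infty}\frac{(ur)^k}{k}\right)dt \\ &\le \int_{I_R}\sum_{l=1}^{\infty} r^l\left(\sum_{k=1}^{l-1}\frac{(u/r)^k}{k} + \sum_{k=l}^{\infty}\frac{(u/r)^k}{k}\right)dt \\ &= -\int_{I_R}\frac{r}{1 - r}\ln\left(1 - \frac{u}{r}\right)dt. \end{aligned} \]

Therefore, $I_5$ is well defined. Further, define a special function in the form of series as

\[ S_{r1}(u, r, t) \triangleq \sum_{l=1}^{\infty}\frac{e^{ilt}}{l}\left(r^l\sum_{k=1}^{l-1}\frac{(u/r)^k}{k} + r^{-l}\sum_{k=l}^{\infty}\frac{(ur)^k}{k}\right). \]

Then

\[ I_5 = -iS_{r1}(u, r, t)\Big|_{-\pi}^{-a} - iS_{r1}(u, r, t)\Big|_{a}^{\pi} = iS_{r1}(u, r, a) - iS_{r1}(u, r, -a). \tag{44} \]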

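In conclusion, when $y < 1$,

\[ \begin{aligned} \bar{\mathcal{I}}_\infty &= \left[\ln(w) - \ln(\alpha)\right]\bar{s}_\infty + \frac{1}{2\pi r}\left(\frac{1 + r^2}{r}I_1 + I_2 + \frac{1 - r^2}{r}I_4 + \frac{1 - r^2}{r}I_5\right) \\ &= \left[\ln(w) - \ln(\alpha)\right]\bar{s}_\infty + \frac{1 + r^2}{2\pi r^2}I_1 + \frac{1}{2\pi r}I_2 + \frac{1 - r^2}{2\pi r^2}I_4 + \frac{1 - r^2}{2\pi r^2}I_5, \end{aligned} \]

where $I_1$, $I_2$, $I_4$ and $I_5$ can be calculated according to (41)-(44).

When $y = 1$, the calculation can be highly simplified. Substitute $f_T(t)$ into $I_0$, then

\[ I_0 = \frac{1}{2\pi}\int_{I_R}\ln\left(1 - ue^{it}\right)\left(e^{it} + 2 + e^{-it}\right)dt = \frac{1}{\pi}I_1 + \frac{1}{2\pi}I_2, \]

where $I_1$ and $I_2$ are defined in (39) and (40) respectively. Thus

\[ \bar{\mathcal{I}}_\infty = \left[\ln(w) - \ln(\alpha)\right]\bar{s}_\infty + \frac{1}{\pi}I_1 + \frac{1}{2\pi}I_2, \]

where $I_1$ and $I_2$ can be calculated by (41) and (42) respectively.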
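For the $y = 1$ case the closed form is short enough to evaluate directly. The following sketch compares it with a direct numerical integration of the defining integral. It assumes $f_T(t) = (1 + \cos t)/\pi$ and $\alpha = \bar{s}_\infty/\rho$ for $y = 1$, and writes (41) as the equivalent real series $I_1 = 2\sum_k u^k \sin(ka)/k^2$, truncated after a fixed number of terms; the function names and the truncation length are illustrative choices.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def rate_closed_y1(a, rho, terms=400):
    """Closed-form normalized rate for y = 1 via I_1 and I_2."""
    s = (math.pi - a - math.sin(a)) / math.pi
    alpha = s / rho                       # (34) with y = 1
    A = 2.0 + alpha
    root = math.sqrt(A * A - 4.0)
    w = 0.5 * (A + root)                  # (35)
    u = 0.5 * (A - root)                  # (36) with sqrt(y) = 1
    # (41) as a real series: i[Li2(u e^{-ia}) - Li2(u e^{ia})]
    I1 = 2.0 * sum(u ** k * math.sin(k * a) / k ** 2
                   for k in range(1, terms + 1))
    theta_u = math.atan(u * math.sin(a) / (1.0 - u * math.cos(a)))
    I2 = 2.0 * (-math.sin(a) * math.log(1.0 + u * u - 2.0 * u * math.cos(a))
                + math.sin(a) - u * (math.pi - a) - (1.0 / u - u) * theta_u)
    return (math.log(w) - math.log(alpha)) * s + I1 / math.pi + I2 / (2.0 * math.pi)

def rate_numeric_y1(a, rho):
    """Direct integration of ln(1 + lambda/alpha) f_T over [a, pi] for y = 1."""
    s = (math.pi - a - math.sin(a)) / math.pi
    alpha = s / rho
    f = lambda t: (math.log1p((2.0 - 2.0 * math.cos(t)) / alpha)
                   * (1.0 + math.cos(t)) / math.pi)
    return simpson(f, a, math.pi)
```

At $a = 0$ the expression collapses to $\ln(w/\alpha) - u$, which gives a convenient special case for checking an implementation.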
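F. Calculation of $d\bar{\mathcal{I}}_\infty/da$

Define $I_R$, $\alpha$, $w$ and $u$ as in (32), (34), (35) and (36). It is easy to see that

\[ \frac{d\alpha}{da} = -\frac{y}{\rho}f_T(a). \]

According to the formula for the normalized information rate,

\[ \begin{aligned} \frac{d\bar{\mathcal{I}}_\infty}{da} &= -\ln\left(1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(a)\right)\right)f_T(a) + \frac{1}{2}\int_{I_R}\frac{-\frac{1}{\alpha^2}\frac{d\alpha}{da}\left(1 + y - 2\sqrt{y}\cos(t)\right)}{1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(t)\right)}f_T(t)\,dt \\ &= -\ln\left(1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(a)\right)\right)f_T(a) + \frac{y}{\rho\alpha}f_T(a)\,\frac{1}{2}\int_{I_R}\frac{1 + y - 2\sqrt{y}\cos(t)}{1 + y + \alpha - 2\sqrt{y}\cos(t)}f_T(t)\,dt. \end{aligned} \]

By (34),

\[ \frac{y}{\rho\alpha}\,\frac{1}{2}\int_{I_R}\frac{1 + y - 2\sqrt{y}\cos(t)}{1 + y + \alpha - 2\sqrt{y}\cos(t)}f_T(t)\,dt = 1 - \frac{y}{\rho}\,\frac{1}{2}\int_{I_R}\frac{1}{1 + y + \alpha - 2\sqrt{y}\cos(t)}f_T(t)\,dt. \]

Define

\[ I_d \triangleq \frac{1}{2}\int_{I_R}\frac{1}{1 + y + \alpha - 2\sqrt{y}\cos(t)}f_T(t)\,dt, \tag{45} \]

then

\[ \frac{d\bar{\mathcal{I}}_\infty}{da} = f_T(a)\left[1 - \ln\left(1 + \frac{1}{\alpha}\left(1 + y - 2\sqrt{y}\cos(a)\right)\right) - \frac{y}{\rho}I_d\right]. \tag{46} \]

Consider the calculation of $I_d$. Since $w\left(1 + u^2 - 2u\cos(t)\right) = 1 + y + \alpha - 2\sqrt{y}\cos(t)$,

\[ I_d = \frac{1}{2w}\int_{I_R}\frac{1}{1 + u^2 - 2u\cos(t)}f_T(t)\,dt = \frac{1}{2w(1 - u^2)}\int_{I_R}\left(\frac{1}{1 - ue^{it}} + \frac{1}{1 - ue^{-it}} - 1\right)f_T(t)\,dt. \]

According to the symmetry of $I_R$ and $f_T(t)$,

\[ \int_{I_R}\frac{1}{1 - ue^{-it}}f_T(t)\,dt = \int_{I_R}\frac{1}{1 - ue^{it}}f_T(t)\,dt. \]

Then

\[ I_d = \frac{1}{2w(1 - u^2)}\int_{I_R}\left(\frac{2}{1 - ue^{it}} - 1\right)f_T(t)\,dt. \]

Calculate $I_d$ for the case $y < 1$ and the case $y = 1$ respectively.

When $y < 1$,

\[ \begin{aligned} I_d &= \frac{1}{4\pi w(1 - u^2)}\int_{I_R}\left(\frac{2}{1 - ue^{it}} - 1\right)\frac{2 - e^{2it} - e^{-2it}}{1 + r^2 - 2r\cos(t)}\,dt \\ &= \frac{1}{4\pi w(1 - u^2)(1 - r^2)}\int_{I_R}\left(\frac{2}{1 - ue^{it}} - 1\right)\left(\frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1\right)\left(2 - e^{2it} - e^{-2it}\right)dt. \end{aligned} \]

Expand the integrand. Since

\[ \frac{1}{1 - ue^{it}}\,\frac{1}{1 - re^{it}} = -\frac{u}{r - u}\,\frac{1}{1 - ue^{it}} + \frac{r}{r - u}\,\frac{1}{1 - re^{it}} \]

and

\[ \frac{1}{1 - ue^{it}}\,\frac{1}{1 - re^{-it}} = \frac{1}{1 - ur}\,\frac{1}{1 - ue^{it}} - \frac{1}{1 - ur}\,\frac{1}{1 - \frac{1}{r}e^{it}}, \]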

I d can be split into four parts 



where 



lJ = 4™(l-,l')(l-r») C' + '' + f ' + ''> 



u \ f 2- e 2it - e" 2i * , 

-11/ : r. dt. 



1 — ur r — u ) J j 1 — ue lt 
r - u J J j 1 - re lt
'Ir

$I_9$ can be easily calculated,

$I_9 = 4(\pi - a)$.

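To evaluate $I_6$, $I_7$ and $I_8$, expand the integrands like what has been done in (37). Note that

\[ \int_{I_R}\frac{re^{it}}{1 - re^{it}}\,dt = \int_{I_R}\frac{re^{-it}}{1 - re^{-it}}\,dt = -2\theta_r \]

and

\[ \int_{I_R}\frac{ue^{it}}{1 - ue^{it}}\,dt = -2\theta_u, \]

where

\[ \theta_u = \tan^{-1}\left(\frac{u\sin(a)}{1 - u\cos(a)}\right) \]

and

\[ \theta_r = \tan^{-1}\left(\frac{r\sin(a)}{1 - r\cos(a)}\right) \]

according to (33). $I_6$, $I_7$ and $I_8$ can be calculated and finally $I_d$ can be written as

\[ I_d = \frac{1}{\pi w(1 - ur)}\left[\pi - a - \frac{1 - u^2}{u(r - u)}\theta_u + \frac{1 - r^2}{r(r - u)}\theta_r\right]. \]

When $y = 1$,

\[ I_d = \frac{1}{4\pi w(1 - u^2)}\int_{I_R}\left(\frac{2}{1 - ue^{it}} - 1\right)\left(e^{it} + 2 + e^{-it}\right)dt. \]

The integrand can be simplified as

\[ \frac{2(1 + u)^2}{u}\,\frac{ue^{it}}{1 - ue^{it}} + 2 + 2u + e^{-it} - e^{it}. \]

Therefore,

\[ I_d = \frac{\pi - a}{\pi w(1 - u)} - \frac{(1 + u)\theta_u}{\pi w u(1 - u)}. \]

Substitute the value of $I_d$ into (46). $d\bar{\mathcal{I}}_\infty/da$ can be evaluated.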
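The closed forms of $I_d$ can be compared against quadrature of the definition (45). The sketch below assumes $f_T(t) = 2\sin^2(t)/(\pi(1 + y - 2\sqrt{y}\cos t))$ and treats $\alpha > 0$ as a free input; the function names and test values are illustrative choices, not from the paper.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def I_d_closed(a, y, alpha):
    """Closed-form I_d for 0 < y <= 1 and alpha > 0."""
    r = math.sqrt(y)
    A = 1.0 + y + alpha
    root = math.sqrt(A * A - 4.0 * y)
    w = 0.5 * (A + root)            # (35)
    u = (A - root) / (2.0 * r)      # (36)
    theta_u = math.atan(u * math.sin(a) / (1.0 - u * math.cos(a)))
    if y == 1.0:
        return ((math.pi - a) / (math.pi * w * (1.0 - u))
                - (1.0 + u) * theta_u / (math.pi * w * u * (1.0 - u)))
    theta_r = math.atan(r * math.sin(a) / (1.0 - r * math.cos(a)))
    bracket = ((math.pi - a)
               - (1.0 - u * u) / (u * (r - u)) * theta_u
               + (1.0 - r * r) / (r * (r - u)) * theta_r)
    return bracket / (math.pi * w * (1.0 - u * r))

def I_d_numeric(a, y, alpha):
    """Direct integration of f_T(t) / (lambda(t) + alpha) over [a, pi]."""
    r = math.sqrt(y)
    def g(t):
        lam = 1.0 + y - 2.0 * r * math.cos(t)
        return 2.0 * math.sin(t) ** 2 / (math.pi * lam * (lam + alpha))
    return simpson(g, a, math.pi)
```

For $a = 0$ both angles vanish and the expression reduces to $1/(w(1 - ur))$, which is a simple special case for spot checks.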

G. Calculation of the average power for CSITR case 
For CSITR case, 
1 + 2/-2 v /?/COS (t) 



VSr 



f T (t)dt 



-f T (t)dt 



2j lR l + y-2^ycos (*)" 

where $I_R$ is defined in (32), $\bar{s}_\infty$ can be evaluated according to Proposition 1, and the second line follows from the fact that the integrand is even. Define

\[ I_{10} = \frac{1}{2}\int_{I_R}\frac{f_T(t)}{1 + y - 2\sqrt{y}\cos(t)}\,dt. \]

We are going to evaluate $I_{10}$ for $y < 1$ and $y = 1$ respectively.

When $y < 1$,
Since

re

'1 - re 1
1 - re u (1 - re u Y
i R (1 - re 4
1

and
h R (1 - re U Y
1 f 1
1 - r 2 VI - re 1 - re
-77 dt

re
-rdt,

ho can be simplified as
7T



1 — r^ 
where

\[ \theta_r = \tan^{-1}\left(\frac{r\sin(a)}{1 - r\cos(a)}\right). \]

When $y = 1$,

\[ I_{10} = \frac{1}{4\pi}\int_{I_R}\frac{1 + \cos(t)}{1 - \cos(t)}\,dt = \frac{1}{2\pi}\left(-\pi + a + \frac{2}{\tan\left(\frac{a}{2}\right)}\right). \]
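The $y = 1$ expression for $I_{10}$ is easy to confirm numerically. This is a small sketch under the assumption that $f_T(t) = (1 + \cos t)/\pi$ for $y = 1$, so that $I_{10} = \int_a^\pi \frac{1 + \cos t}{2\pi(1 - \cos t)}\,dt$; the function names are illustrative.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def I10_closed_y1(a):
    """Closed form of I_10 for y = 1 and 0 < a < pi."""
    return (-math.pi + a + 2.0 / math.tan(a / 2.0)) / (2.0 * math.pi)

def I10_numeric_y1(a):
    """Direct integration of f_T(t) / (2 - 2 cos t) over [a, pi] for y = 1."""
    g = lambda t: (1.0 + math.cos(t)) / (2.0 * math.pi * (1.0 - math.cos(t)))
    return simpson(g, a, math.pi)
```

Note that the integrand grows like $1/t^2$ near $t = 0$, so the expression is only meaningful for $a > 0$, consistent with the $1/\tan(a/2)$ term.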



H. Calculation of the normalized capacity for CSITR case 
For CSITR case, 



J In ^ (1 + r 2 - 2r cos (t))J / T (t) 

In Eoo + \l ln (! + r ' - 2r cos (t)) / T (t) dt, 



where $r = \sqrt{y}$, $I_R$ is defined as in (32), $\bar{s}_\infty$ can be evaluated according to Proposition 1, and the second line follows from the fact that the integrand is even. Define

\[ I_{11} = \frac{1}{2}\int_{I_R}\ln\left(1 + r^2 - 2r\cos(t)\right)f_T(t)\,dt; \]

we are going to evaluate $I_{11}$ for $y < 1$ and $y = 1$ respectively.

When $y < 1$, since

\[ 1 + r^2 - 2r\cos(t) = \left(1 - re^{it}\right)\left(1 - re^{-it}\right) \]

and

\[ \int_{I_R}\ln\left(1 - re^{it}\right)f_T(t)\,dt = \int_{I_R}\ln\left(1 - re^{-it}\right)f_T(t)\,dt, \]

$I_{11}$ can be expressed as

Expand the integrand like what has been done in (37) and (38), then

where

By an analysis similar to that in (41) and (42),

\[ I_{12} = i\left[\mathrm{Li}_2\left(re^{-ia}\right) - \mathrm{Li}_2\left(re^{ia}\right)\right] \]

and

where

To evaluate $I_{14}$, define

and

then $I_{14} = I_{15} + I_{16}$.

It is easy to evaluate $I_{15}$.

To evaluate $I_{16}$, express the integrand in series. Because $0 < r < 1$,

where $\theta_r$ is defined as in (49). $I_{18}$ is well defined because

In conclusion,

where $\bar{s}_\infty$, $I_{12}$, $I_{13}$, $I_{15}$, $I_{17}$ and $I_{18}$ can be evaluated by the corresponding equations above, respectively.

When $y = 1$,

Substitute $f_T(t)$ into it,

\[ I_{11} = \frac{1}{2\pi}\int_{I_R}\ln\left(1 - e^{-it}\right)\left(e^{it} + 2 + e^{-it}\right)dt. \]

By an analysis similar to that in (41) and (42),

Then the proposition is proved.



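I. Proof of Theorem 6

If the optimal number of on-beams $\hat{s}(\mathbf{H})$ is known, the optimal feedback function is given by

\[ \varphi(\mathbf{H}) = \arg\max_{i:\ \mathbf{Q}_i \in \mathcal{B}_{\hat{s}(\mathbf{H})}} \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right|. \]

Thus the only nontrivial part is to prove the optimality of

\[ \hat{s}(\mathbf{H}) = \max\left\{s : \mathcal{I}_s(\mathbf{H}) - \mathcal{I}_t(\mathbf{H}) \ge (s - t)\kappa \text{ for all } t \text{ s.t. } 0 \le t < s,\ \mathcal{B}_t \ne \emptyset,\ \mathcal{B}_s \ne \emptyset\right\}, \]

where

\[ \mathcal{I}_s(\mathbf{H}) = \max_{\mathbf{Q}_i \in \mathcal{B}_s}\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right|. \]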
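The following lemma is useful to prove the optimality of $\hat{s}(\mathbf{H})$. For simplicity, we denote $\hat{s}(\mathbf{H})$ by $\hat{s}$ from now on. It is necessary to keep in mind that $\hat{s}$ is not a constant but a function of the channel realization $\mathbf{H}$.

Lemma 3: For all $t > \hat{s}$ such that $\mathcal{B}_t \ne \emptyset$, $\mathcal{I}_t(\mathbf{H}) - \mathcal{I}_{\hat{s}}(\mathbf{H}) < (t - \hat{s})\kappa$.

Proof: Suppose that this lemma is not true. Then $\exists t > \hat{s}$ such that $\mathcal{B}_t \ne \emptyset$ and $\mathcal{I}_t(\mathbf{H}) - \mathcal{I}_{\hat{s}}(\mathbf{H}) \ge (t - \hat{s})\kappa$. Take the minimum such $t$ and denote it as $t_0$,

\[ t_0 = \min\left\{t > \hat{s} : \mathcal{B}_t \ne \emptyset,\ \mathcal{I}_t(\mathbf{H}) - \mathcal{I}_{\hat{s}}(\mathbf{H}) \ge (t - \hat{s})\kappa\right\}. \]

Then for all $t$ s.t. $0 \le t < \hat{s}$,

\[ \mathcal{I}_{t_0} - \mathcal{I}_t = \mathcal{I}_{t_0} - \mathcal{I}_{\hat{s}} + \mathcal{I}_{\hat{s}} - \mathcal{I}_t \ge (t_0 - \hat{s})\kappa + (\hat{s} - t)\kappa = (t_0 - t)\kappa, \]

where the inequality follows from the definitions of $\hat{s}$ and $t_0$. At the same time, for a $t$ s.t. $\hat{s} < t < t_0$, $\mathcal{I}_t - \mathcal{I}_{\hat{s}} < (t - \hat{s})\kappa$ according to the definition of $t_0$ and the fact that $t < t_0$. Then

\[ \mathcal{I}_{t_0} - \mathcal{I}_t = \mathcal{I}_{t_0} - \mathcal{I}_{\hat{s}} + \mathcal{I}_{\hat{s}} - \mathcal{I}_t > (t_0 - \hat{s})\kappa - (t - \hat{s})\kappa = (t_0 - t)\kappa. \]

Thus, $\mathcal{I}_{t_0} - \mathcal{I}_t \ge (t_0 - t)\kappa$ for all $t < t_0$ with $\mathcal{B}_t \ne \emptyset$, which contradicts the definition of $\hat{s}$. This lemma is proved. ■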

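To prove that $\hat{s}$ is optimal, we compare $\varphi(\cdot)$ with an arbitrary deterministic feedback function $\varphi'(\cdot)$ satisfying the power constraint. Let $s' = \mathrm{rank}\left(\mathbf{Q}_{\varphi'(\mathbf{H})}\right)$ be the number of on-beams according to the feedback function $\varphi'(\cdot)$. Let $F_{\mathbf{H}}(\mathbf{H})$ denote the CDF of the channel state $\mathbf{H}$. The power constraint can be expressed as

\[ \int_{\mathbb{C}^{L_R \times L_T}} s' P_{on}\,dF_{\mathbf{H}}(\mathbf{H}) = \rho. \]

Define $\Delta s = \hat{s} - s'$ and

\[ \Omega_{\Delta s} = \left\{\mathbf{H} \in \mathbb{C}^{L_R \times L_T} : \hat{s} - s' = \Delta s\right\}, \]

where $-L_T \le \Delta s \le L_T$. Since both $\varphi(\cdot)$ and $\varphi'(\cdot)$ satisfy the power constraint, we have

\[ \sum_{\Delta s = -L_T}^{L_T}\int_{\Omega_{\Delta s}} \Delta s \cdot P_{on}\,dF_{\mathbf{H}}(\mathbf{H}) = 0. \]

On the other hand, the performance difference between $\varphi(\cdot)$ and $\varphi'(\cdot)$ is given by

\[ \begin{aligned} &\int_{\mathbb{C}^{L_R \times L_T}} \mathcal{I}_{\hat{s}}(\mathbf{H})\,dF_{\mathbf{H}}(\mathbf{H}) - \int_{\mathbb{C}^{L_R \times L_T}} \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_{\varphi'(\mathbf{H})}\mathbf{Q}_{\varphi'(\mathbf{H})}^\dagger\mathbf{H}^\dagger\right|dF_{\mathbf{H}}(\mathbf{H}) \\ &\overset{(a)}{\ge} \int_{\mathbb{C}^{L_R \times L_T}} \left(\mathcal{I}_{\hat{s}}(\mathbf{H}) - \mathcal{I}_{s'}(\mathbf{H})\right)dF_{\mathbf{H}}(\mathbf{H}) \\ &= \sum_{\Delta s = -L_T}^{-1}\int_{\Omega_{\Delta s}} \left(\mathcal{I}_{\hat{s}}(\mathbf{H}) - \mathcal{I}_{s'}(\mathbf{H})\right)dF_{\mathbf{H}}(\mathbf{H}) + \sum_{\Delta s = 1}^{L_T}\int_{\Omega_{\Delta s}} \left(\mathcal{I}_{\hat{s}}(\mathbf{H}) - \mathcal{I}_{s'}(\mathbf{H})\right)dF_{\mathbf{H}}(\mathbf{H}) \\ &\overset{(b)}{\ge} -\sum_{\Delta s = -L_T}^{-1}\int_{\Omega_{\Delta s}} |\Delta s| \cdot \kappa\,dF_{\mathbf{H}}(\mathbf{H}) + \sum_{\Delta s = 1}^{L_T}\int_{\Omega_{\Delta s}} \Delta s \cdot \kappa\,dF_{\mathbf{H}}(\mathbf{H}) \\ &= \kappa\sum_{\Delta s = -L_T}^{L_T}\int_{\Omega_{\Delta s}} \Delta s\,dF_{\mathbf{H}}(\mathbf{H}) \\ &\overset{(c)}{=} 0, \end{aligned} \]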



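where

(a) follows from the fact that

\[ \ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_{\varphi'(\mathbf{H})}\mathbf{Q}_{\varphi'(\mathbf{H})}^\dagger\mathbf{H}^\dagger\right| \le \max_{\mathbf{Q}_i \in \mathcal{B}_{s'}}\ln\left|\mathbf{I}_{L_R} + P_{on}\mathbf{H}\mathbf{Q}_i\mathbf{Q}_i^\dagger\mathbf{H}^\dagger\right| = \mathcal{I}_{s'}(\mathbf{H}), \]

(b) follows from Lemma 3 and the definition of $\hat{s}$, and

(c) follows from the power constraint.

Therefore, $\varphi(\cdot)$ is the optimal feedback function.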
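The selection rule for the number of on-beams can be sketched in a few lines. The example below is illustrative only: it assumes the per-rank best rates $\mathcal{I}_t(\mathbf{H})$ have already been computed for every $t$ with a non-empty codebook $\mathcal{B}_t$ (with $t = 0$ included at rate zero, an assumption about the index range), and the function name `optimal_on_beams` and the sample numbers are hypothetical.

```python
def optimal_on_beams(rates, kappa):
    """Return the largest s whose rate gain over every smaller t is at least (s - t) * kappa.

    rates: dict mapping t (number of on-beams with a non-empty codebook)
           to the best achievable rate I_t(H) for this channel realization.
    kappa: threshold, i.e. the price of turning on one more beam.
    """
    candidates = sorted(rates)
    best = candidates[0]  # the smallest candidate satisfies the condition vacuously
    for s in candidates:
        if all(rates[s] - rates[t] >= (s - t) * kappa
               for t in candidates if t < s):
            best = s
    return best

# Hypothetical per-rank rates for one channel realization
example_rates = {0: 0.0, 1: 1.0, 2: 1.6, 3: 1.9}
```

With these sample rates, a small threshold such as `kappa = 0.2` turns all three beams on, while `kappa = 0.5` stops at two beams because the third beam adds only 0.3, consistent with Lemma 3.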